* John Snow (js...@redhat.com) wrote:
>
>
> On 12/09/2014 01:15 PM, Dr. David Alan Gilbert (git) wrote:
> >From: "Dr. David Alan Gilbert" <dgilb...@redhat.com>
> >
> >(With the previous atapi_dma flag recovery)
> >If migration happens between the ATAPI command being written and the
> >bmdma being started, the DMA is dropped. Eventually the guest times
> >out and recovers, but that can take many seconds.
> >(This is rare, on a pingpong reading the CD continuously I hit
> >this about ~1/30-1/50 migrates)
> >
> >I don't think we've got enough state to be able to recover safely
> >at this point, so I throw a 'medium error, no seek complete'
> >that I'm assuming guests will try and recover from an apparently
> >dirty CD.
> >
> >OK, it's a hack, the real solution is probably to push a lot of
> >ATAPI state into the migration stream, but this is a fix that
> >works with no stream changes. Tested only on Linux (both RHEL5
> >(pre-libata) and RHEL7).
> >
> >Signed-off-by: Dr. David Alan Gilbert <dgilb...@redhat.com>
> >---
> >  hw/ide/atapi.c    | 17 +++++++++++++++++
> >  hw/ide/internal.h |  2 ++
> >  hw/ide/pci.c      | 11 +++++++++++
> >  3 files changed, 30 insertions(+)
> >
> >diff --git a/hw/ide/atapi.c b/hw/ide/atapi.c
> >index c63b7e5..e17799c 100644
> >--- a/hw/ide/atapi.c
> >+++ b/hw/ide/atapi.c
> >@@ -394,6 +394,23 @@ static void ide_atapi_cmd_read(IDEState *s, int lba,
> >int nb_sectors,
> >      }
> >  }
> >
> >+
> >+/* Called by *_restart_bh when the transfer function points
> >+ * to ide_atapi_cmd
> >+ */
> >+void ide_atapi_dma_restart(IDEState *s)
> >+{
> >+    /*
> >+     * I'm not sure we have enough stored to restart the command
> >+     * safely, so give the guest an error it should recover from.
> >+     * I'm assuming most guests will try to recover from something
> >+     * listed as a medium error on a CD; it seems to work on Linux.
> >+     * This would be more of a problem if we did any other type of
> >+     * DMA operation.
> >+     */
> >+    ide_atapi_cmd_error(s, MEDIUM_ERROR, ASC_NO_SEEK_COMPLETE);
> >+}
> >+
>
> Is this safe for non-data commands? Can we even get there in such a case?

See below.

> >  static inline uint8_t ide_atapi_set_profile(uint8_t *buf, uint8_t *index,
> >                                              uint16_t profile)
> >  {
> >diff --git a/hw/ide/internal.h b/hw/ide/internal.h
> >index 8a3eca4..8b65285 100644
> >--- a/hw/ide/internal.h
> >+++ b/hw/ide/internal.h
> >@@ -289,6 +289,7 @@ typedef struct IDEDMAOps IDEDMAOps;
> >  #define ATAPI_INT_REASON_TAG 0xf8
> >
> >  /* same constants as bochs */
> >+#define ASC_NO_SEEK_COMPLETE 0x02
> >  #define ASC_ILLEGAL_OPCODE 0x20
> >  #define ASC_LOGICAL_BLOCK_OOR 0x21
> >  #define ASC_INV_FIELD_IN_CMD_PACKET 0x24
> >@@ -529,6 +530,7 @@ void ide_dma_error(IDEState *s);
> >
> >  void ide_atapi_cmd_ok(IDEState *s);
> >  void ide_atapi_cmd_error(IDEState *s, int sense_key, int asc);
> >+void ide_atapi_dma_restart(IDEState *s);
> >  void ide_atapi_io_error(IDEState *s, int ret);
> >
> >  void ide_ioport_write(void *opaque, uint32_t addr, uint32_t val);
> >diff --git a/hw/ide/pci.c b/hw/ide/pci.c
> >index bee5ad3..e3f2054 100644
> >--- a/hw/ide/pci.c
> >+++ b/hw/ide/pci.c
> >@@ -235,6 +235,17 @@ static void bmdma_restart_bh(void *opaque)
> >          }
> >      } else if (error_status & IDE_RETRY_FLUSH) {
> >          ide_flush_cache(bmdma_active_if(bm));
> >+    } else {
> >+        IDEState *s = bmdma_active_if(bm);
> >+
> >+        /*
> >+         * We've not got any bits to tell us about ATAPI - but
> >+         * we do have the end_transfer_func that tells us what
> >+         * we're trying to do.
> >+         */
> >+        if (s->end_transfer_func == ide_atapi_cmd) {
> >+            ide_atapi_dma_restart(s);
> >+        }
>
> OK, so when the restart routines get invoked we add a hook to see if we were
> in the middle of an ATAPI command and acknowledge that we don't know how to
> properly handle this.

As to your question above about non-data commands; hmm probably - but how
do I guard it? I guess I could check for the atapi_dma flag the previous
patch fixed.
(This is all probably still broken for non-DMA atapi transfers)

> Isn't this going to run on every vmstate change, though?

There aren't many - only starting/stopping the CPU does it; and
bmdma_restart_cb guards it by 'if (!running)' exit, so it'll only do it
when the CPU starts running again.

> I think we don't
> clear out end_transfer_func on success, so this might fire off more than we
> want it to, although I guess end_transfer_func is usually going to get set
> to ide_atapi_cmd_reply_end if it finishes normally ...

Right, or if ide_transfer_stop is called.

> >      }
> >  }
> >
> >
>
> Indeed a hack, but it's probably appropriate: if our code cannot in fact
> handle ATAPI migration, throwing an error or disabling migration is the
> correct thing to do, but I don't think users would be very happy with the
> second option. I feel that this is an OK workaround because it should not
> introduce spurious errors or retries for cases where we manage to avoid
> migrating in the middle of the loop. This will at least let the currently
> broken case limp along until we fix it more properly.
>
> What makes me the most curious is how this plays out in Windows if this case
> is triggered. Throw a trace around the fake error and see if you can't
> observe it getting called during a pingpong test while Windows reads a CD.

Yeh, I'm going to figure out how to try that.

Dave

--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
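
For reference, the guard floated above (checking the atapi_dma flag as well as
end_transfer_func, and only acting once the CPU is running again) might look
roughly like the sketch below. It is a standalone model, not QEMU code:
MockIDEState, mock_ide_atapi_cmd, mock_ide_atapi_cmd_reply_end and
should_fake_medium_error are made-up stand-ins, and only the two fields
discussed in this thread are modelled. Whether atapi_dma is actually
sufficient to exclude non-data commands is the open question, so treat this
as an illustration of the idea only.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for QEMU's IDEState; only the two fields
 * discussed in this thread are modelled. */
typedef struct MockIDEState MockIDEState;
typedef void MockEndTransferFunc(MockIDEState *s);

struct MockIDEState {
    MockEndTransferFunc *end_transfer_func;
    bool atapi_dma;    /* assumed: the flag the previous patch recovers */
};

/* Stand-ins for ide_atapi_cmd and ide_atapi_cmd_reply_end; they exist
 * only so that the function pointers can be compared. */
static void mock_ide_atapi_cmd(MockIDEState *s) { (void)s; }
static void mock_ide_atapi_cmd_reply_end(MockIDEState *s) { (void)s; }

/*
 * Returns true when the restart path should report the fake
 * 'medium error, no seek complete' to the guest:
 *  - the VM is running again (mirrors the 'if (!running)' early exit),
 *  - a packet command was written but not yet processed, and
 *  - that command was set up as a DMA transfer (the proposed extra guard).
 */
static bool should_fake_medium_error(const MockIDEState *s, bool running)
{
    if (!running) {
        return false;
    }
    return s->end_transfer_func == mock_ide_atapi_cmd && s->atapi_dma;
}
```

With the extra flag check, a command that completed normally (so that
end_transfer_func points at the reply_end stand-in) or one that was not a DMA
transfer would be left alone on restart.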