On Wed, 2013-07-17 at 14:31 +0200, Alexander Graf wrote:
> On 17.07.2013, at 13:15, Benjamin Herrenschmidt wrote:
>
> > On Wed, 2013-07-17 at 19:46 +1000, Alexey Kardashevskiy wrote:
> >> Current hcd-ohci does not handle DMA errors which can actually
> >> happen.
> >>
> >> However it is not clear what approach should be used here -
> >> for example, get_dwords returns positive number saying that there
> >> is no error as all the callers consider the return value as fail
> >> if it is less than zero. Normally you would expect bool=true/int=0
> >> as success and bool=false/int=-1 as fail.
> >>
> >> Any suggestion?
> >
> > The right thing to do is not only to bring the error up the stack, but
> > essentially to set the error bits in the PCI command status and put the
> > whole HCI in error state (and stop operating)
> >
> > That how real HW reacts.
>
> Who does that? I always assumed it's the IOMMU that kills the device
> when it accesses regions it's not allowed to access.

Hah, no, IOMMUs only "kill devices" on fancy HW like powerpc :-) On
these, when any kind of error occurs, we isolate the entire thing.

> On real hardware, memory transfers don't have error return codes, do they?

No, they sort-of do :-)

For example, on PCI, there are 3 common causes of errors: Parity, Target
Aborts and Master Aborts. The first is somewhat obvious, the second means
the target aborted the cycle before completion, and the last usually means
no target responded (timeout).

There are two physical lines used to convey error information (and
potentially abort cycles), PERR and SERR.

Depending on the details of the bus protocol, the error causes can be a
bit different. On PCIe you can actually shoot error messages up the link,
transactions are packets and can result in an error response, etc...

Since qemu mostly emulates PCI, let's stick to that.

An IOMMU error will typically be a target abort, so the device should
react as such. A typical O/EHCI will stop operating, set itself into an
error state (which can be queried by MMIO) and set something like PERR
in its config space to signal that it got an error.

Cheers,
Ben.
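
To make the reaction described above concrete, here is a minimal sketch of a host controller that latches a DMA failure instead of ignoring it. It is a standalone toy model, not QEMU code: struct toy_hc, toy_dma_read(), toy_hc_die(), toy_hc_fetch() and the tiny address map are all made up for illustration. The PCI Status bit values and the OHCI UnrecoverableError bit follow the respective specs; which status bit a given controller really sets is an assumption here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* PCI Status register: config space offset 0x06, little-endian. */
#define PCI_STATUS                  0x06
#define PCI_STATUS_REC_TARGET_ABORT 0x1000 /* master received a target abort */
#define PCI_STATUS_REC_MASTER_ABORT 0x2000 /* no target claimed the cycle */
#define PCI_STATUS_DETECTED_PARITY  0x8000 /* parity error detected */

/* OHCI HcInterruptStatus: UnrecoverableError. */
#define OHCI_INTR_UE (1u << 4)

/* Hypothetical toy address map, not taken from any real platform. */
#define TOY_RAM_SIZE     256u  /* [0, 256): DMA succeeds */
#define TOY_DECODE_LIMIT 4096u /* [256, 4096): claimed, but refused */

enum dma_result { DMA_OK = 0, DMA_TARGET_ABORT, DMA_MASTER_ABORT, DMA_PARITY };

struct toy_hc {
    uint8_t  config[64];  /* first 64 bytes of PCI config space */
    uint32_t intr_status; /* stand-in for HcInterruptStatus */
    bool     halted;      /* true once the controller has stopped */
};

static uint8_t toy_ram[TOY_RAM_SIZE];

/* Toy bus: reports how the cycle ended instead of always "succeeding".
 * This model never produces DMA_PARITY; it exists only to show the mapping. */
enum dma_result toy_dma_read(uint32_t addr, void *buf, size_t len)
{
    assert(buf != NULL);
    if (addr >= TOY_DECODE_LIMIT) {
        return DMA_MASTER_ABORT; /* nobody responded */
    }
    if (addr >= TOY_RAM_SIZE || len > TOY_RAM_SIZE - addr) {
        return DMA_TARGET_ABORT; /* e.g. an IOMMU refusing the access */
    }
    memcpy(buf, toy_ram + addr, len);
    return DMA_OK;
}

/* Latch the error in config space, flag it via MMIO state, stop operating. */
void toy_hc_die(struct toy_hc *hc, enum dma_result err)
{
    uint16_t status = (uint16_t)(hc->config[PCI_STATUS] |
                                 (hc->config[PCI_STATUS + 1] << 8));

    switch (err) {
    case DMA_TARGET_ABORT: status |= PCI_STATUS_REC_TARGET_ABORT; break;
    case DMA_MASTER_ABORT: status |= PCI_STATUS_REC_MASTER_ABORT; break;
    case DMA_PARITY:       status |= PCI_STATUS_DETECTED_PARITY;  break;
    case DMA_OK:           return; /* nothing to report */
    }
    hc->config[PCI_STATUS]     = (uint8_t)(status & 0xff);
    hc->config[PCI_STATUS + 1] = (uint8_t)(status >> 8);
    hc->intr_status |= OHCI_INTR_UE;
    hc->halted = true;
}

/* Returns 0 on success, -1 on failure; a halted controller does no DMA. */
int toy_hc_fetch(struct toy_hc *hc, uint32_t addr, void *buf, size_t len)
{
    enum dma_result res;

    if (hc->halted) {
        return -1;
    }
    res = toy_dma_read(addr, buf, len);
    if (res != DMA_OK) {
        toy_hc_die(hc, res);
        return -1;
    }
    return 0;
}
```

With this shape, callers keep a plain 0 / -1 convention, and the first failed fetch both halts the controller and leaves the cause visible in the status word, so later fetches fail without touching the bus.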