On 04/14/2017 08:45 AM, Alexander Duyck wrote:
On Thu, Apr 13, 2017 at 11:12 AM, Ben Greear <gree...@candelatech.com> wrote:
Hello,

I have been regularly seeing DMAR errors like the following when testing
my ath10k driver/firmware under some specific loads (maximum receive of
512-byte frames in AP mode):

DMAR: DRHD: handling fault status reg 3
DMAR: [DMA Read] Request device [05:00.0] fault addr fd99f000 [fault reason 06] PTE Read access is not set
ath10k_pci 0000:05:00.0: firmware crashed! (uuid 594b1393-ae35-42b5-9dec-74ff0c6791ff)

So, I am wondering if there is any way I can get more information about
what this fd99f000 address is?

Once this problem hits, the entire OS locks hard (not even sysrq-boot will
do anything), so I guess I would need the DMAR logic to print out more
info on that address somehow.

Thanks,
Ben

There isn't much more info to give you. The problem is that the device
at 5:00.0 attempted to read at fd99f000 even though it didn't have
permissions. In response this should trigger a PCI Master Abort
message to that function. It looks like the firmware for the device
doesn't handle that and so that is likely why things got hung.

Really you would need to interrogate the ath10k_pci driver to see if
there is/was a mapping somewhere for that address and what it was
supposed to be used for.
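
One way to answer that question from the driver side is to keep a short
history of DMA mappings and look the faulting address up in it. The
sketch below is a userspace C illustration only, not ath10k or kernel
code: the names dma_hist_map, dma_hist_unmap and dma_hist_find are
hypothetical, and it assumes 4 KiB IOMMU pages and a page-aligned fault
address, as fd99f000 is in the log above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DMA_HIST_LEN 8                          /* hypothetical history depth */
#define IOMMU_PAGE_MASK (~(uint64_t)0xfff)      /* assumes 4 KiB pages */

struct dma_hist_entry {
    uint64_t addr;      /* bus address handed to the device */
    size_t len;         /* length of the mapping in bytes */
    bool mapped;        /* false once the buffer has been unmapped */
    bool in_use;        /* slot holds a real record */
};

struct dma_hist {
    struct dma_hist_entry e[DMA_HIST_LEN];
    unsigned int next;  /* slot the next record will overwrite */
};

/* Record a new mapping, overwriting the oldest slot. */
static void dma_hist_map(struct dma_hist *h, uint64_t addr, size_t len)
{
    assert(h != NULL);
    struct dma_hist_entry *e = &h->e[h->next];
    e->addr = addr;
    e->len = len;
    e->mapped = true;
    e->in_use = true;
    h->next = (h->next + 1) % DMA_HIST_LEN;
}

/* Mark the newest still-mapped record for addr as unmapped. */
static void dma_hist_unmap(struct dma_hist *h, uint64_t addr)
{
    assert(h != NULL);
    for (unsigned int i = 0; i < DMA_HIST_LEN; i++) {
        unsigned int idx = (h->next + DMA_HIST_LEN - 1 - i) % DMA_HIST_LEN;
        struct dma_hist_entry *e = &h->e[idx];
        if (e->in_use && e->mapped && e->addr == addr) {
            e->mapped = false;
            return;
        }
    }
}

/* Return the newest record whose pages cover fault_addr, or NULL. */
static const struct dma_hist_entry *
dma_hist_find(const struct dma_hist *h, uint64_t fault_addr)
{
    assert(h != NULL);
    uint64_t fault_page = fault_addr & IOMMU_PAGE_MASK;
    for (unsigned int i = 0; i < DMA_HIST_LEN; i++) {
        unsigned int idx = (h->next + DMA_HIST_LEN - 1 - i) % DMA_HIST_LEN;
        const struct dma_hist_entry *e = &h->e[idx];
        if (!e->in_use || e->len == 0)
            continue;
        uint64_t first = e->addr & IOMMU_PAGE_MASK;
        uint64_t last = (e->addr + e->len - 1) & IOMMU_PAGE_MASK;
        if (fault_page >= first && fault_page <= last)
            return e;
    }
    return NULL;
}
```

A hit on a record whose mapped flag is false would suggest the device
read a buffer after the driver had already unmapped it; no hit at all
would point at an address the driver never handed out, or one older than
the history depth.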

I'm working on a hook in DMAR logic to call into ath10k_pci when the
error is seen, so the ath10k can dump debug info, including recent DMA
addresses.

My code is an awful hack so far, but if someone could add a clean way to
register DMAR error callbacks, I think that would be very welcome. It
might tie into automated DMA map/unmap debugging logic, and at the least,
someone could write custom debugging callbacks for the driver(s) in
question.
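
To make the idea concrete, here is a rough userspace C sketch of what
such a registration interface might look like. Nothing here is an
existing kernel API: register_dma_fault_cb, unregister_dma_fault_cb,
report_dma_fault, make_bdf and record_fault are all hypothetical names,
and a real version would need locking and would have to be safe to call
from the fault-handling context.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_FAULT_CBS 4     /* hypothetical table size */

/* Callback invoked when a fault is reported for a registered device. */
typedef void (*dma_fault_cb)(uint16_t bdf, uint64_t fault_addr, void *ctx);

struct fault_cb_slot {
    uint16_t bdf;           /* PCI bus/device/function of the requester */
    dma_fault_cb cb;        /* NULL means the slot is free */
    void *ctx;              /* opaque driver data passed back to cb */
};

static struct fault_cb_slot fault_cbs[MAX_FAULT_CBS];

/* Pack bus, device and function the way PCI requester IDs are encoded. */
static uint16_t make_bdf(uint8_t bus, uint8_t dev, uint8_t fn)
{
    return (uint16_t)((bus << 8) | ((dev & 0x1f) << 3) | (fn & 0x7));
}

/* Register cb for one device; returns 0 on success, -1 if the table is full. */
static int register_dma_fault_cb(uint16_t bdf, dma_fault_cb cb, void *ctx)
{
    assert(cb != NULL);
    for (size_t i = 0; i < MAX_FAULT_CBS; i++) {
        if (fault_cbs[i].cb == NULL) {
            fault_cbs[i].bdf = bdf;
            fault_cbs[i].cb = cb;
            fault_cbs[i].ctx = ctx;
            return 0;
        }
    }
    return -1;
}

/* Remove every callback registered for bdf. */
static void unregister_dma_fault_cb(uint16_t bdf)
{
    for (size_t i = 0; i < MAX_FAULT_CBS; i++) {
        if (fault_cbs[i].cb != NULL && fault_cbs[i].bdf == bdf)
            fault_cbs[i].cb = NULL;
    }
}

/* Called by the (simulated) fault handler; returns how many callbacks ran. */
static int report_dma_fault(uint16_t bdf, uint64_t fault_addr)
{
    int called = 0;
    for (size_t i = 0; i < MAX_FAULT_CBS; i++) {
        if (fault_cbs[i].cb != NULL && fault_cbs[i].bdf == bdf) {
            fault_cbs[i].cb(bdf, fault_addr, fault_cbs[i].ctx);
            called++;
        }
    }
    return called;
}

/* Example driver-side callback: remember the last fault for later dumping. */
struct fault_record {
    unsigned int count;
    uint64_t last_addr;
};

static void record_fault(uint16_t bdf, uint64_t fault_addr, void *ctx)
{
    struct fault_record *rec = ctx;
    (void)bdf;
    rec->count++;
    rec->last_addr = fault_addr;
}
```

A driver would register once at probe time for its own device (05:00.0
here would be make_bdf(0x05, 0x00, 0x0)) and unregister on remove; the
callback is where it could dump its recent DMA addresses.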

Thanks,
Ben


- Alex


--
Ben Greear <gree...@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com
