On Thu, 17 Mar 2016 13:33:30 -0700 Joe Perches <j...@perches.com> wrote:
> On Thu, 2016-03-17 at 14:12 -0600, Alex Williamson wrote:
> > Fault rates can easily overwhelm the console and make the system
> > unresponsive. Ratelimit to allow an opportunity for maintenance.
> []
> > diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
> []
> > @@ -1602,10 +1602,17 @@ irqreturn_t dmar_fault(int irq, void *dev_id)
> >  	int reg, fault_index;
> >  	u32 fault_status;
> >  	unsigned long flag;
> > +	bool ratelimited;
> > +	static DEFINE_RATELIMIT_STATE(rs,
> > +				      DEFAULT_RATELIMIT_INTERVAL,
> > +				      DEFAULT_RATELIMIT_BURST);
>
> Are these the appropriate limits for dmar?
>
> include/linux/ratelimit.h:#define DEFAULT_RATELIMIT_INTERVAL (5 * HZ)
> include/linux/ratelimit.h:#define DEFAULT_RATELIMIT_BURST 10

They seem OK to me, I've got a test running that continuously generates
DMA read faults and I get 20 lines of log every 5 seconds. That seems
like enough to know there's an issue, it's ongoing, and maybe see some
patterns in the fault addresses. I expect we could turn up the burst
value but generally when I'm looking at the logs I'm only looking for
things like is it a single target address, is it a sequential address,
or what's the general address space to know if it should or should not
be a valid fault address. Thanks,

Alex
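
For anyone following along, the interval/burst behaviour discussed above can be modelled roughly in user space. This is only a sketch under stated assumptions: `struct ratelimit_model`, `model_ratelimit()` and `model_count_allowed()` are hypothetical names invented for illustration, the tick counter stands in for jiffies, and the logic is a simplified approximation rather than the kernel's actual `___ratelimit()` implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical user-space model of an interval/burst limiter.
 * It is NOT the kernel's struct ratelimit_state; the field names
 * are assumptions chosen for illustration.
 */
struct ratelimit_model {
	unsigned long interval;	/* window length, in ticks */
	unsigned int burst;	/* events allowed per window */
	unsigned long begin;	/* tick at which the current window began */
	unsigned int printed;	/* events allowed in the current window */
	unsigned int missed;	/* events suppressed in the current window */
	bool started;		/* false until the first event is seen */
};

/* Return true if an event arriving at tick 'now' may be logged. */
static bool model_ratelimit(struct ratelimit_model *rs, unsigned long now)
{
	assert(rs != NULL);

	/* Open a fresh window on first use or once the interval has elapsed. */
	if (!rs->started || now - rs->begin >= rs->interval) {
		rs->begin = now;
		rs->printed = 0;
		rs->missed = 0;
		rs->started = true;
	}

	if (rs->printed < rs->burst) {
		rs->printed++;
		return true;
	}

	rs->missed++;
	return false;
}

/* Feed one event per tick for 'ticks' ticks; count how many are allowed. */
static unsigned int model_count_allowed(struct ratelimit_model *rs,
					unsigned long ticks)
{
	unsigned int allowed = 0;

	for (unsigned long t = 0; t < ticks; t++)
		if (model_ratelimit(rs, t))
			allowed++;

	return allowed;
}
```

With an interval of 5000 ticks (5 seconds, assuming 1000 ticks per second) and a burst of 10, a continuous stream of events gets 10 through per window. That would be consistent with the 20 log lines per 5 seconds reported above if each allowed fault report prints two lines, which is an assumption and not something stated in this thread.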