Hi James,

On Wednesday, January 23, 2019 10:36 AM, James Morse wrote:
> When the time comes, could you post a dt-binding as the first patch? These 
> add the documentation under Documentation/device-tree/bindings, and need to 
> be CC'd to the DT folks.

Sure, will do.

>> Will change scrub_mode to HW.

> Do we know its enabled? This is something firmware has to set up, someone 
> else's platform may do it differently.
> I think should read one of the scrub control registers to find out if its 
> turned on.
> But, I can't find what uses this value ...

Looks like if scrub_mode is set to SCRUB_SW_SRC, the MC core will do arch 
specific sw scrub on the page with ce error. 
We can config the mode base on register settings.

>>>> +     /* Enable interrupts */
>>>> +     dmc520_write_reg(edac,
>>>> +                      DRAM_ECC_INT_CE_MASK | DRAM_ECC_INT_UE_MASK,
>>>> +                      REG_OFFSET_INTERRUPT_CONTROL);

>>> What if they were enabled before? (e.g. enabled by firmware, the 
>>> bootloader or kdump). If they're already enabled, can we race with 
>>> the interrupt handler on another CPU and get double reporting of the error 
>>> counter?

>> We don't expect interrupts to be enabled by firmware or bootloader if 
>> this driver is enabled.

> What about a previous instance of this driver? Linux supports kexec, kdump 
> and hibernate, all of which cause us to inherit a slightly used platform.

Agreed. We shouldn't make assumption on the register value when driver starts.

>> If firmware enables it, they're suppose to handle the interrupt.

> Ah, so you still have resident firmware!
> How come your firmware trusts linux not to turn off the memory controller?!
> These things are usually protected by trust zone so the OS can't pull the 
> memory from under firmware's feet.

We have firmware to config the memory controller and want to have an EDAC 
driver to report ECC status.
Could you please elaborate a bit on the security concern on this approach? Like 
some malicious app/driver can access
memory controller registers can cause issue? 

What's the recommend approach if Linux won't be able to access memory 
controller registers? Have firmware do the ECC 
status monitoring and some sort of driver to query ECC status from firmware?

Appreciate your insights on this.

Thanks,
Rui

Reply via email to