* Jeremy Kerr <j...@ozlabs.org> [2019-10-04 11:27:46]:

> Hi Vaidy,
> 
> > The current topic is who owns setting up the ATT bits for that piece
> > of memory.  It is the kernel today.  Kernel decides to set this up as
> > normal memory or I/O memory and sets the bits in page table entry.
> > 
> > > Or, what if there's a range of address-space that isn't backed by system
> > > RAM (say, some MMIO-mapped hardware) that we want to expose to a future
> > > HBRT implementation? This change will block that.
> > > 
> > > The kernel doesn't know what is and is not valid for a HBRT mapping, so
> > > it has no reason to override what's specified in the device tree. We've
> > > designed this so that the kernel provides the mechanism for mapping
> > > pages, and not the policy of which pages can be mapped.
> > 
> > The features altered are cache inhibit and guarding which affects
> > ability to fetch instructions.  If we allow HBRT to reside in an I/O
> > memory, the we need to tell kernel that it is ok to allow caching and
> > instruction execution in that region and accordingly change the ATT
> > bits.
> 
> But this isn't only about the HBRT range itself (ie., the memory containing
> the HBRT binary). Everything that HBRT needs to map will come through this
> path. We may not need to fetch instructions from those ranges.

Correct. Only HBRT code cannot be fetched, but data can be mapped and
used.  However cache inhibit mapping is unnecessary unless explicitly
requested by firmware.

> > This patch does not block a working function, but actually makes
> > debugging a failed case easier.  The failing scenario without this
> > check is such that HBRT cannot fetch from the region of memory and
> > loops in minor page faults doing nothing.
> 
> Yep, that's not great, but the alternative means applying this kernel
> policy, which we can't guarantee is correct.
> 
> That is, unless the page protection bits mean that this won't work anyway,
> but we can probably fix that without a kernel policy, by applying the
> appropriate pgprot_t, perhaps.

Currently if we allow the mapping, it won't work at all for code.  But
can work for data pages.  However cache inhibit mapping will cause
side effects apart from poor performance.  We should not do this
unless firmware has a reason to request for this.  We should pass
additional info from OPAL to driver to set/override permission.

> > As Vasant mentioned hostboot team will add code to relocate the HBRT
> > to the right place.  Addressing your concern, if we end up allowing
> > HBRT in non system-RAM area
> 
> Not just HBRT, but anything that HBRT maps too.

Correct.

> > we need to add some more flags in device
> > tree to instruct the driver to force change the page protection bits
> > as page_prot = pgprot_cached(page_prot);
> 
> Doesn't phys_mem_access_prot() handle that for us? Or do I have my
> _noncached/_cached logic inverted?

The kernel permission policy is implemented via phys_mem_access_prot().

pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
                              unsigned long size, pgprot_t vma_prot)
{
        if (ppc_md.phys_mem_access_prot)
                return ppc_md.phys_mem_access_prot(file, pfn, size, vma_prot);

        if (!page_is_ram(pfn))
                vma_prot = pgprot_noncached(vma_prot);

        return vma_prot;
}

The problem for firmware (HBRT) mapping is precisely this policy that
forces a different defaults pgprot_noncached() if the memory page is
not in system-RAM.

If !page_is_ram(pfn) check is how defaults will change.  This check is
what changes pgprot for us now for NVRAM.  We can override this
permission in opal-prd driver since we know the nature of the use
case. But I would expect OPAL to indicate that we should override and
then proceed to do it in the driver.

If we really architect a usage model where firmware uses NVRAM, then
it should be in its own namespace and not really pick just an address.
Also as per our opal-prd design, other than firmware memory, all
device access should come via host-kernel driver and not directly
mapped to application.

--Vaidy

Reply via email to