On Mon, Apr 18, 2022 at 09:42:00AM -0700, Dan Williams wrote:
> [ add the usual HMM suspects Christoph, Jason, and John ]
> 
> On Wed, Apr 13, 2022 at 11:38 AM Ben Widawsky <[email protected]> wrote:
> >
> > Define an API which allows CXL drivers to manage CXL address space.
> > CXL is unique in that the address space and various properties are only
> > known after CXL drivers come up, and therefore cannot be part of core
> > memory enumeration.
> 
> I think this buries the lead on the problem introduced by
> MEMORY_DEVICE_PRIVATE in the first place. Let's revisit that history
> before diving into what CXL needs.
> 
> 
> Commit 4ef589dc9b10 ("mm/hmm/devmem: device memory hotplug using
> ZONE_DEVICE") introduced the concept of MEMORY_DEVICE_PRIVATE. At its
> core MEMORY_DEVICE_PRIVATE uses the ZONE_DEVICE capability to annotate
> an "unused" physical address range with 'struct page' for the purpose
> of coordinating migration of buffers onto and off of a GPU /
> accelerator. The determination of "unused" was based on a heuristic,
> not a guarantee, that any address range not expressly conveyed in the
> platform firmware map of the system can be repurposed for software
> use. The CXL Fixed Memory Windows Structure  (CFMWS) definition
> explicitly breaks the assumptions of that heuristic.

So CXL defines an address map that is not part of the FW list?

> > It would be desirable to simply insert this address space into
> > iomem_resource with a new flag to denote this is CXL memory. This would
> > permit request_free_mem_region() to be reused for CXL memory provided it
> > learned some new tricks. For that, it is tempting to simply use
> > insert_resource(). The API was designed specifically for cases where new
> > devices may offer new address space. This cannot work in the general
> > case. Boot firmware can pass, some, none, or all of the CFMWS range as
> > various types of memory to the kernel, and this may be left alone,
> > merged, or even expanded.

And then we understand that on CXL the FW is might pass stuff that
intersects with the actual CXL ranges?

> > As a result iomem_resource may intersect CFMWS
> > regions in ways insert_resource cannot handle [2]. Similar reasoning
> > applies to allocate_resource().
> >
> > With the insert_resource option out, the only reasonable approach left
> > is to let the CXL driver manage the address space independently of
> > iomem_resource and attempt to prevent users of device private memory

And finally due to all these FW problems we are going to make a 2nd
allocator for physical address space and just disable the normal one?

Then since DEVICE_PRIVATE is a notable user of this allocator we now
understand it becomes broken?

Sounds horrible. IMHO you should fix the normal allocator somehow to
understand that the ranges from FW have been reprogrammed by Linux and
not try to build a whole different allocator in CXL code.

Jason

Reply via email to