On Tue, 29 Mar 2022 12:53:51 -0700
Davidlohr Bueso <d...@stgolabs.net> wrote:

> On Tue, 29 Mar 2022, Adam Manzanares wrote:
> >> +typedef struct cxl_device_state {
> >> +    MemoryRegion device_registers;
> >> +
> >> +    /* mmio for device capabilities array - 8.2.8.2 */
> >> +    MemoryRegion device;
> >> +    MemoryRegion caps;
> >> +
> >> +    /* mmio for the mailbox registers 8.2.8.4 */
> >> +    MemoryRegion mailbox;
> >> +
> >> +    /* memory region for persistent memory, HDM */
> >> +    uint64_t pmem_size;  
> >
> >Can we switch this to mem_size and drop the persistent comment? It is my
> >understanding that HDM is independent of persistence.  
> 
> Agreed, but ideally both volatile and persistent capacities would have been
> supported in this patchset. I'm also probably missing specific reasons as to
> why this isn't the case.

Whilst it doesn't add a huge amount of complexity it does add some
and the software paths in Linux we were developing this for are pmem focused.
Hence volatile is on the todo list rather than in this first patch set.
Not sensible to aim for feature complete in one go.

> 
> Looking at it briefly could it be just a matter of adding to cxl_type3_dev
> a new hostmem along with it's AddressSpace for the volatile? If so, I'm
> thinking something along these lines:
> 
> @@ -123,8 +123,8 @@ typedef struct cxl_device_state {
>        uint64_t host_set;
>       } timestamp;
> 
> -    /* memory region for persistent memory, HDM */
> -    uint64_t pmem_size;
> +    /* memory region for persistent and volatile memory, HDM */
> +    uint64_t pmem_size, mem_size;
>   } CXLDeviceState;
> 
>   /* Initialize the register block for a device */
> @@ -235,9 +235,9 @@ typedef struct cxl_type3_dev {
>       PCIDevice parent_obj;
> 
>       /* Properties */
> -    AddressSpace hostmem_as;
> +    AddressSpace hostmem_as, hostmemv_as;
>       uint64_t size;
> -    HostMemoryBackend *hostmem;
> +    HostMemoryBackend *hostmem, *hostmemv;
>       HostMemoryBackend *lsa;
>       uint64_t sn;
> 
> Then for cxl_setup_memory(), with ct3d->hostmem and/or ct3d->hostmemv
> non-nil, set the respective MemoryRegions:
> 
> +    if (ct3d->hostmem) {
> +            memory_region_set_nonvolatile(mr, true);
> +            memory_region_set_enabled(mr, true);
> +            host_memory_backend_set_mapped(ct3d->hostmem, true);
> +            address_space_init(&ct3d->hostmem_as, mr, name);
> +            ct3d->cxl_dstate.pmem_size = ct3d->hostmem->size;
> +    }
> +    if (ct3d->hostmemv) {
> +            memory_region_set_nonvolatile(mrv, false);
> +            memory_region_set_enabled(mrv, true);
> +            host_memory_backend_set_mapped(ct3d->hostmemv, true);
> +            address_space_init(&ct3d->hostmem_as, mrv, name);
> +            ct3d->cxl_dstate.pmem_size = ct3d->hostmem->size;
> +    }
> 
> For corresponding MB commands, it's mostly IDENTIFY_MEMORY_DEVICE that needs
> updating:
> 
> @@ -281,7 +281,7 @@ static ret_code cmd_identify_memory_device(struct cxl_cmd 
> *cmd,
> 
>       CXLType3Dev *ct3d = container_of(cxl_dstate, CXLType3Dev, cxl_dstate);
>       CXLType3Class *cvc = CXL_TYPE3_DEV_GET_CLASS(ct3d);
> -    uint64_t size = cxl_dstate->pmem_size;
> +    uint64_t size = cxl_dstate->pmem_size + cxl_dstate->mem_size;
> 
>       if (!QEMU_IS_ALIGNED(size, 256 << 20)) {
>        return CXL_MBOX_INTERNAL_ERROR;
> @@ -290,11 +290,11 @@ static ret_code cmd_identify_memory_device(struct 
> cxl_cmd *cmd,
>       id = (void *)cmd->payload;
>       memset(id, 0, sizeof(*id));
> 
> -    /* PMEM only */
>       snprintf(id->fw_revision, 0x10, "BWFW VERSION %02d", 0);
> 
>       id->total_capacity = size / (256 << 20);
> -    id->persistent_capacity = size / (256 << 20);
> +    id->persistent_capacity = cxl_dstate->pmem_size / (256 << 20);
> +    id->volatile_capacity = cxl_dstate->mem_size / (256 << 20);
>       id->lsa_size = cvc->get_lsa_size(ct3d);
> 
>       *len = sizeof(*id);
> @@ -312,16 +312,16 @@ static ret_code cmd_ccls_get_partition_info(struct 
> cxl_cmd *cmd,
>        uint64_t next_pmem;
>       } QEMU_PACKED *part_info = (void *)cmd->payload;
>       QEMU_BUILD_BUG_ON(sizeof(*part_info) != 0x20);
> -    uint64_t size = cxl_dstate->pmem_size;
> +    uint64_t psize = cxl_dstate->pmem_size;
> +    uint64_t vsize = cxl_dstate->mem_size;
> 
> -    if (!QEMU_IS_ALIGNED(size, 256 << 20)) {
> +    if (!QEMU_IS_ALIGNED(psize + vsize, 256 << 20)) {
>        return CXL_MBOX_INTERNAL_ERROR;
>       }
> 
> -    /* PMEM only */
> -    part_info->active_vmem = 0;
> -    part_info->next_vmem = 0;
> -    part_info->active_pmem = size / (256 << 20);
> +    part_info->active_vmem = vsize / (256 << 20);
> +    part_info->next_vmem = part_info->active_vmem;
> +    part_info->active_pmem = psize / (256 << 20);
>       part_info->next_pmem = part_info->active_pmem;
> 
> Then for reads/writes, both cxl_type3_write() and _read() would, after 
> computing the dpa_offset,
> first try the volatile region then upon error attempt the same in the 
> persistent memory - this
> assuming the DPA space is consistent among both types of memory. Ie:
> 
> address_space_read(&ct3d->hostmemv_as, dpa_offset, attrs, data, size)
> or
> address_space_read(&ct3d->hostmem_as, dpa_offset, attrs, data, size)
> 
> ... but then again all this is probably just wishful thinking.

Without looking in detail, will indeed be something along those lines.
Gets more fiddly if you include partitioning control that Alison was interested
in adding.

Also, we probably need to support multiple HDM decoders.  Also not a huge
complexity to add, but left for now as main focus is to get the base
patch set merged.

So I'm happy to queue stuff up on top of this series and carry it forward
but I don't want to add features to what we try to merge initially.
This set is already huge and hard to review even with what think is a
minimum set of features to be useful.

Note I'm already carrying a few features on top if this on the gitlab
branch gitlab.com/jic23/qemu (DOE + CDAT and serial numbers) and
have a few other things out of tree for now (SPDM, emulating most
of the PCI Config space controls). 

Jonathan

> 
> Thanks,
> Davidlohr


Reply via email to