Jonathan Cameron wrote:
> On Thu, 23 Jun 2022 19:47:18 -0700
> Dan Williams <[email protected]> wrote:
> 
> > The region provisioning flow will roughly follow a sequence of:
> > 
> > 1/ Allocate DPA to a set of decoders
> > 
> > 2/ Allocate HPA to a region
> > 
> > 3/ Associate decoders with a region and validate that the DPA allocations
> >    and topologies match the parameters of the region.
> > 
> > For now, this change (step 1) arranges for DPA capacity to be allocated
> > and deleted from non-committed decoders based on the decoder's mode /
> > partition selection. Capacity is allocated from the lowest DPA in the
> > partition and any 'pmem' allocation blocks out all remaining ram
> > capacity in its 'skip' setting. DPA allocations are enforced in decoder
> > instance order. I.e. decoder N + 1 always starts at a higher DPA than
> > instance N, and deleting allocations must proceed from the
> > highest-instance allocated decoder to the lowest.
> > 
> > Signed-off-by: Dan Williams <[email protected]>
> 
> The error value setting in here might save a few lines, but to me it
> is less readable than setting rc in each error path.
> 
> > ---
> >  Documentation/ABI/testing/sysfs-bus-cxl |   37 +++++++
> >  drivers/cxl/core/core.h                 |    7 +
> >  drivers/cxl/core/hdm.c                  |  160 
> > +++++++++++++++++++++++++++++++
> >  drivers/cxl/core/port.c                 |   73 ++++++++++++++
> >  4 files changed, 275 insertions(+), 2 deletions(-)
> > 
> > diff --git a/Documentation/ABI/testing/sysfs-bus-cxl 
> > b/Documentation/ABI/testing/sysfs-bus-cxl
> > index 091459216e11..85844f9bc00b 100644
> > --- a/Documentation/ABI/testing/sysfs-bus-cxl
> > +++ b/Documentation/ABI/testing/sysfs-bus-cxl
> > @@ -171,7 +171,7 @@ Date:           May, 2022
> >  KernelVersion:     v5.20
> >  Contact:   [email protected]
> >  Description:
> > -           (RO) When a CXL decoder is of devtype "cxl_decoder_endpoint" it
> > +           (RW) When a CXL decoder is of devtype "cxl_decoder_endpoint" it
> >             translates from a host physical address range, to a device local
> >             address range. Device-local address ranges are further split
> >             into a 'ram' (volatile memory) range and 'pmem' (persistent
> > @@ -180,3 +180,38 @@ Description:
> >             when a decoder straddles the volatile/persistent partition
> >             boundary, and 'none' indicates the decoder is not actively
> >             decoding, or no DPA allocation policy has been set.
> > +
> > +           'mode' can be written, when the decoder is in the 'disabled'
> > +           state, with either 'ram' or 'pmem' to set the boundaries for the
> > +           next allocation.
> > +
> 
> As before, documentation above this in the file only uses single line break 
> between
> entries.

Yeah, I'll go fix those up as a precursor patch.

> 
> > +
> > +What:              /sys/bus/cxl/devices/decoderX.Y/dpa_resource
> > +Date:              May, 2022
> > +KernelVersion:     v5.20
> > +Contact:   [email protected]
> > +Description:
> > +           (RO) When a CXL decoder is of devtype "cxl_decoder_endpoint",
> > +           and its 'dpa_size' attribute is non-zero, this attribute
> > +           indicates the device physical address (DPA) base address of the
> > +           allocation.
> 
> Why _resource rather than _base or _start?

To mimic PCI and NVDIMM sysfs that calls it a 'resource' address.

> 
> > +
> > +
> > +What:              /sys/bus/cxl/devices/decoderX.Y/dpa_size
> > +Date:              May, 2022
> > +KernelVersion:     v5.20
> > +Contact:   [email protected]
> > +Description:
> > +           (RW) When a CXL decoder is of devtype "cxl_decoder_endpoint" it
> > +           translates from a host physical address range, to a device local
> > +           address range. The range, base address plus length in bytes, of
> > +           DPA allocated to this decoder is conveyed in these 2 attributes.
> > +           Allocations can be mutated as long as the decoder is in the
> > +           disabled state. A write to 'size' releases the previous DPA
> 
> 'dpa_size' ?

Yes.

> 
> > +           allocation and then attempts to allocate from the free capacity
> > +           in the device partition referred to by 'decoderX.Y/mode'.
> > +           Allocate and free requests can only be performed on the highest
> > +           instance number disabled decoder with non-zero size. I.e.
> > +           allocations are enforced to occur in increasing 'decoderX.Y/id'
> > +           order and frees are enforced to occur in decreasing
> > +           'decoderX.Y/id' order.
> > diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
> > index 1a50c0fc399c..47cf0c286fc3 100644
> > --- a/drivers/cxl/core/core.h
> > +++ b/drivers/cxl/core/core.h
> > @@ -17,6 +17,13 @@ int cxl_send_cmd(struct cxl_memdev *cxlmd, struct 
> > cxl_send_command __user *s);
> >  void __iomem *devm_cxl_iomap_block(struct device *dev, resource_size_t 
> > addr,
> >                                resource_size_t length);
> >  
> > +int cxl_dpa_set_mode(struct cxl_endpoint_decoder *cxled,
> > +                enum cxl_decoder_mode mode);
> > +int cxl_dpa_alloc(struct cxl_endpoint_decoder *cxled, unsigned long long 
> > size);
> > +int cxl_dpa_free(struct cxl_endpoint_decoder *cxled);
> > +resource_size_t cxl_dpa_size(struct cxl_endpoint_decoder *cxled);
> > +resource_size_t cxl_dpa_resource(struct cxl_endpoint_decoder *cxled);
> > +
> >  int cxl_memdev_init(void);
> >  void cxl_memdev_exit(void);
> >  void cxl_mbox_init(void);
> > diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
> > index 8805afe63ebf..ceb4c28abc1b 100644
> > --- a/drivers/cxl/core/hdm.c
> > +++ b/drivers/cxl/core/hdm.c
> > @@ -248,6 +248,166 @@ static int cxl_dpa_reserve(struct 
> > cxl_endpoint_decoder *cxled,
> >     return devm_add_action_or_reset(&port->dev, cxl_dpa_release, cxled);
> >  }
> >  
> > +resource_size_t cxl_dpa_size(struct cxl_endpoint_decoder *cxled)
> > +{
> > +   resource_size_t size = 0;
> > +
> > +   down_read(&cxl_dpa_rwsem);
> > +   if (cxled->dpa_res)
> > +           size = resource_size(cxled->dpa_res);
> > +   up_read(&cxl_dpa_rwsem);
> > +
> > +   return size;
> > +}
> > +
> > +resource_size_t cxl_dpa_resource(struct cxl_endpoint_decoder *cxled)
> 
> Instinct would be to expect this to return the resource, not the start.
> Rename?

I can add _start, but this is only servicing the dpa_resource_show()
sysfs operation, and there 'resource' as a the base address has ABI
history.

> 
> 
> > +{
> > +   resource_size_t base = -1;
> > +
> > +   down_read(&cxl_dpa_rwsem);
> > +   if (cxled->dpa_res)
> > +           base = cxled->dpa_res->start;
> > +   up_read(&cxl_dpa_rwsem);
> > +
> > +   return base;
> > +}
> > +
> > +int cxl_dpa_free(struct cxl_endpoint_decoder *cxled)
> > +{
> > +   int rc = -EBUSY;
> > +   struct device *dev = &cxled->cxld.dev;
> > +   struct cxl_port *port = to_cxl_port(dev->parent);
> > +
> > +   down_write(&cxl_dpa_rwsem);
> > +   if (!cxled->dpa_res) {
> > +           rc = 0;
> > +           goto out;
> > +   }
> > +   if (cxled->cxld.flags & CXL_DECODER_F_ENABLE) {
> > +           dev_dbg(dev, "decoder enabled\n");
> 
> I'd prefer explicit setting of rc = -EBUSY in the two
> 'error' paths to make it really clear when looking at these
> that they are treated as errors.

Ok.

> 
> > +           goto out;
> > +   }
> > +   if (cxled->cxld.id != port->dpa_end) {
> > +           dev_dbg(dev, "expected decoder%d.%d\n", port->id,
> > +                   port->dpa_end);
> > +           goto out;
> > +   }
> > +   __cxl_dpa_release(cxled, true);
> > +   rc = 0;
> > +out:
> > +   up_write(&cxl_dpa_rwsem);
> > +   return rc;
> > +}
> > +
> > +int cxl_dpa_set_mode(struct cxl_endpoint_decoder *cxled,
> > +                enum cxl_decoder_mode mode)
> > +{
> > +   struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
> > +   struct cxl_dev_state *cxlds = cxlmd->cxlds;
> > +   struct device *dev = &cxled->cxld.dev;
> > +   int rc = -EBUSY;
> 
> As above, I'd prefer seeing error set in each error path rther
> than it being set in a few locations and having to go look
> for which value it currently has.  To me having the
> error code next to the condition is much easier to follow.

Fair enough.

> 
> > +
> > +   switch (mode) {
> > +   case CXL_DECODER_RAM:
> > +   case CXL_DECODER_PMEM:
> > +           break;
> > +   default:
> > +           dev_dbg(dev, "unsupported mode: %d\n", mode);
> > +           return -EINVAL;
> > +   }
> > +
> > +   down_write(&cxl_dpa_rwsem);
> > +   if (cxled->cxld.flags & CXL_DECODER_F_ENABLE)
> > +           goto out;
> > +   /*
> > +    * Only allow modes that are supported by the current partition
> > +    * configuration
> > +    */
> > +   rc = -ENXIO;
> > +   if (mode == CXL_DECODER_PMEM && !resource_size(&cxlds->pmem_res)) {
> > +           dev_dbg(dev, "no available pmem capacity\n");
> > +           goto out;
> > +   }
> > +   if (mode == CXL_DECODER_RAM && !resource_size(&cxlds->ram_res)) {
> > +           dev_dbg(dev, "no available ram capacity\n");
> > +           goto out;
> > +   }
> > +
> > +   cxled->mode = mode;
> > +   rc = 0;
> > +out:
> > +   up_write(&cxl_dpa_rwsem);
> > +
> > +   return rc;
> > +}
> > +
> > +int cxl_dpa_alloc(struct cxl_endpoint_decoder *cxled, unsigned long long 
> > size)
> > +{
> > +   struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
> > +   resource_size_t free_ram_start, free_pmem_start;
> > +   struct cxl_port *port = cxled_to_port(cxled);
> > +   struct cxl_dev_state *cxlds = cxlmd->cxlds;
> > +   struct device *dev = &cxled->cxld.dev;
> > +   resource_size_t start, avail, skip;
> > +   struct resource *p, *last;
> > +   int rc = -EBUSY;
> > +
> > +   down_write(&cxl_dpa_rwsem);
> > +   if (cxled->cxld.flags & CXL_DECODER_F_ENABLE) {
> > +           dev_dbg(dev, "decoder enabled\n");
> > +           goto out;
> 
> 
> -EBUSY only used in this path, so clearer to me to push that setting down
> to in this  error path.

Ok.

Folded in these changes:

diff --git a/Documentation/ABI/testing/sysfs-bus-cxl 
b/Documentation/ABI/testing/sysfs-bus-cxl
index 85844f9bc00b..3fa6da73751e 100644
--- a/Documentation/ABI/testing/sysfs-bus-cxl
+++ b/Documentation/ABI/testing/sysfs-bus-cxl
@@ -207,7 +207,7 @@ Description:
                address range. The range, base address plus length in bytes, of
                DPA allocated to this decoder is conveyed in these 2 attributes.
                Allocations can be mutated as long as the decoder is in the
-               disabled state. A write to 'size' releases the previous DPA
+               disabled state. A write to 'dpa_size' releases the previous DPA
                allocation and then attempts to allocate from the free capacity
                in the device partition referred to by 'decoderX.Y/mode'.
                Allocate and free requests can only be performed on the highest
diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
index 47cf0c286fc3..65bcaecec405 100644
--- a/drivers/cxl/core/core.h
+++ b/drivers/cxl/core/core.h
@@ -22,7 +22,7 @@ int cxl_dpa_set_mode(struct cxl_endpoint_decoder *cxled,
 int cxl_dpa_alloc(struct cxl_endpoint_decoder *cxled, unsigned long long size);
 int cxl_dpa_free(struct cxl_endpoint_decoder *cxled);
 resource_size_t cxl_dpa_size(struct cxl_endpoint_decoder *cxled);
-resource_size_t cxl_dpa_resource(struct cxl_endpoint_decoder *cxled);
+resource_size_t cxl_dpa_resource_start(struct cxl_endpoint_decoder *cxled);
 
 int cxl_memdev_init(void);
 void cxl_memdev_exit(void);
diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
index 0c1ff3c0142f..e9281557781d 100644
--- a/drivers/cxl/core/hdm.c
+++ b/drivers/cxl/core/hdm.c
@@ -270,7 +270,7 @@ resource_size_t cxl_dpa_size(struct cxl_endpoint_decoder 
*cxled)
        return size;
 }
 
-resource_size_t cxl_dpa_resource(struct cxl_endpoint_decoder *cxled)
+resource_size_t cxl_dpa_resource_start(struct cxl_endpoint_decoder *cxled)
 {
        resource_size_t base = -1;
 
@@ -284,9 +284,9 @@ resource_size_t cxl_dpa_resource(struct 
cxl_endpoint_decoder *cxled)
 
 int cxl_dpa_free(struct cxl_endpoint_decoder *cxled)
 {
-       int rc = -EBUSY;
+       struct cxl_port *port = cxled_to_port(cxled);
        struct device *dev = &cxled->cxld.dev;
-       struct cxl_port *port = to_cxl_port(dev->parent);
+       int rc;
 
        down_write(&cxl_dpa_rwsem);
        if (!cxled->dpa_res) {
@@ -295,11 +295,13 @@ int cxl_dpa_free(struct cxl_endpoint_decoder *cxled)
        }
        if (cxled->cxld.flags & CXL_DECODER_F_ENABLE) {
                dev_dbg(dev, "decoder enabled\n");
+               rc = -EBUSY;
                goto out;
        }
-       if (cxled->cxld.id != port->dpa_end) {
+       if (cxled->cxld.id != port->hdm_end) {
                dev_dbg(dev, "expected decoder%d.%d\n", port->id,
-                       port->dpa_end);
+                       port->hdm_end);
+               rc = -EBUSY;
                goto out;
        }
        devm_cxl_dpa_release(cxled);
@@ -315,7 +317,7 @@ int cxl_dpa_set_mode(struct cxl_endpoint_decoder *cxled,
        struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
        struct cxl_dev_state *cxlds = cxlmd->cxlds;
        struct device *dev = &cxled->cxld.dev;
-       int rc = -EBUSY;
+       int rc;
 
        switch (mode) {
        case CXL_DECODER_RAM:
@@ -327,19 +329,23 @@ int cxl_dpa_set_mode(struct cxl_endpoint_decoder *cxled,
        }
 
        down_write(&cxl_dpa_rwsem);
-       if (cxled->cxld.flags & CXL_DECODER_F_ENABLE)
+       if (cxled->cxld.flags & CXL_DECODER_F_ENABLE) {
+               rc = -EBUSY;
                goto out;
+       }
+
        /*
         * Only allow modes that are supported by the current partition
         * configuration
         */
-       rc = -ENXIO;
        if (mode == CXL_DECODER_PMEM && !resource_size(&cxlds->pmem_res)) {
                dev_dbg(dev, "no available pmem capacity\n");
+               rc = -ENXIO;
                goto out;
        }
        if (mode == CXL_DECODER_RAM && !resource_size(&cxlds->ram_res)) {
                dev_dbg(dev, "no available ram capacity\n");
+               rc = -ENXIO;
                goto out;
        }
 
@@ -360,11 +366,12 @@ int cxl_dpa_alloc(struct cxl_endpoint_decoder *cxled, 
unsigned long long size)
        struct device *dev = &cxled->cxld.dev;
        resource_size_t start, avail, skip;
        struct resource *p, *last;
-       int rc = -EBUSY;
+       int rc;
 
        down_write(&cxl_dpa_rwsem);
        if (cxled->cxld.flags & CXL_DECODER_F_ENABLE) {
                dev_dbg(dev, "decoder enabled\n");
+               rc = -EBUSY;
                goto out;
        }
 
diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index 130989846cce..feed86737202 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -215,7 +215,7 @@ static ssize_t dpa_resource_show(struct device *dev, struct 
device_attribute *at
                            char *buf)
 {
        struct cxl_endpoint_decoder *cxled = to_cxl_endpoint_decoder(dev);
-       u64 base = cxl_dpa_resource(cxled);
+       u64 base = cxl_dpa_resource_start(cxled);
 
        return sysfs_emit(buf, "%#llx\n", base);
 }

Reply via email to