Re: [PATCH] cxl: don't manipulate the mm.mm_users field directly
On 10/03/2021 18:44, Laurent Dufour wrote:
> It is better to rely on the API provided by the MM layer instead of
> directly manipulating the mm_users field.
>
> Signed-off-by: Laurent Dufour
> ---

Thanks!
Acked-by: Frederic Barrat

>  drivers/misc/cxl/fault.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/misc/cxl/fault.c b/drivers/misc/cxl/fault.c
> index 01153b74334a..60c829113299 100644
> --- a/drivers/misc/cxl/fault.c
> +++ b/drivers/misc/cxl/fault.c
> @@ -200,7 +200,7 @@ static struct mm_struct *get_mem_context(struct cxl_context *ctx)
>  	if (ctx->mm == NULL)
>  		return NULL;
>
> -	if (!atomic_inc_not_zero(&ctx->mm->mm_users))
> +	if (!mmget_not_zero(ctx->mm))
>  		return NULL;
>
>  	return ctx->mm;
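For readers unfamiliar with the pattern behind the patch above, here is a minimal userspace sketch of "take a reference only if the count is not already zero". It uses C11 atomics, not the kernel API; struct obj and obj_get_not_zero() are hypothetical names chosen for illustration, and the kernel helper may well be implemented differently.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for an object with a user count. */
struct obj {
	atomic_int users;
};

/*
 * Take a reference only if the object is still live (count > 0).
 * Returns true on success, false if the count had already reached zero.
 */
static bool obj_get_not_zero(struct obj *o)
{
	int old = atomic_load(&o->users);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&o->users, &old, old + 1))
			return true;
	}
	return false;
}
```

Wrapping the compare-and-swap loop in one named helper is the point of the patch: callers state their intent and no longer touch the counter field directly.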
Re: [PATCH v2 -next] misc: ocxl: use DEFINE_MUTEX() for mutex lock
On 24/12/2020 14:24, Zheng Yongjun wrote:
> mutex lock can be initialized automatically with DEFINE_MUTEX()
> rather than explicitly calling mutex_init().
>
> Signed-off-by: Zheng Yongjun
> ---

Thanks!
Acked-by: Frederic Barrat

>  drivers/misc/ocxl/file.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/misc/ocxl/file.c b/drivers/misc/ocxl/file.c
> index 4d1b44de1492..e70525eedaae 100644
> --- a/drivers/misc/ocxl/file.c
> +++ b/drivers/misc/ocxl/file.c
> @@ -15,7 +15,7 @@
>  static dev_t ocxl_dev;
>  static struct class *ocxl_class;
> -static struct mutex minors_idr_lock;
> +static DEFINE_MUTEX(minors_idr_lock);
>  static struct idr minors_idr;
>
>  static struct ocxl_file_info *find_and_get_file_info(dev_t devno)
> @@ -588,7 +588,6 @@ int ocxl_file_init(void)
>  {
>  	int rc;
>
> -	mutex_init(&minors_idr_lock);
>  	idr_init(&minors_idr);
>
>  	rc = alloc_chrdev_region(&ocxl_dev, 0, OCXL_NUM_MINORS, "ocxl");
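As a rough userspace analogue of the change above, the sketch below uses a statically initialized pthread mutex instead of a runtime init call. This is POSIX threads, not the kernel mutex API; minors_lock, next_minor and alloc_minor() are hypothetical names used only for illustration.

```c
#include <pthread.h>

/* Statically initialized: usable before any init function has run. */
static pthread_mutex_t minors_lock = PTHREAD_MUTEX_INITIALIZER;
static int next_minor;

/* Hypothetical helper: hand out consecutive minor numbers under the lock. */
static int alloc_minor(void)
{
	int minor;

	pthread_mutex_lock(&minors_lock);
	minor = next_minor++;
	pthread_mutex_unlock(&minors_lock);
	return minor;
}
```

With static initialization there is no window in which the lock exists but has not been set up, and the init function has one less step to get wrong.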
Re: [PATCH kernel v3] genirq/irqdomain: Add reference counting to IRQs
On 14/11/2020 04:37, Alexey Kardashevskiy wrote:
>> I'll try to go through this patch over the week-end (or more probably
>> early next week), and try to understand where our understandings differ.
>
> Great, thanks! Fred spotted a problem with irq_free_descs() not doing
> kobject_put() anymore and this is a problem for sa.c and the likes and
> I will go through these places anyway.

So there are callers out there which don't care about mapping the
interrupt.

Wouldn't it be easier to leave alone the kobject from the irq descriptor
(my understanding is that it's there to handle the sysfs representation)
and add a simple kref counter, just to handle the mapping part?

  Fred
Re: [PATCH v2 2/2] misc: ocxl: config: Rename function attribute description
On 02/11/2020 15:20, Lee Jones wrote:
> Fixes the following W=1 kernel build warning(s):
>
>  drivers/misc/ocxl/config.c:81: warning: Function parameter or member 'dev' not described in 'get_function_0'
>  drivers/misc/ocxl/config.c:81: warning: Excess function parameter 'device' description in 'get_function_0'
>
> Cc: Frederic Barrat
> Cc: Andrew Donnellan
> Cc: Arnd Bergmann
> Cc: Greg Kroah-Hartman
> Cc: linuxppc-...@lists.ozlabs.org
> Signed-off-by: Lee Jones
> ---

Thanks!
Acked-by: Frederic Barrat

>  drivers/misc/ocxl/config.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/misc/ocxl/config.c b/drivers/misc/ocxl/config.c
> index 4d490b92d951f..a68738f382521 100644
> --- a/drivers/misc/ocxl/config.c
> +++ b/drivers/misc/ocxl/config.c
> @@ -73,7 +73,7 @@ static int find_dvsec_afu_ctrl(struct pci_dev *dev, u8 afu_idx)
>
>  /**
>   * get_function_0() - Find a related PCI device (function 0)
> - * @device: PCI device to match
> + * @dev: PCI device to match
>   *
>   * Returns a pointer to the related device, or null if not found
>   */
Re: [PATCH v2 20/39] docs: ABI: testing: make the files compatible with ReST output
On 30/10/2020 08:40, Mauro Carvalho Chehab wrote:
> Some files over there won't parse well by Sphinx.
>
> Fix them.
>
> Acked-by: Jonathan Cameron # for IIO
> Signed-off-by: Mauro Carvalho Chehab
> ---
...
>  Documentation/ABI/testing/sysfs-class-cxl | 15 +-
...
>  Documentation/ABI/testing/sysfs-class-ocxl| 3 +

Patches 20, 28 and 31 look good for cxl and ocxl.
Acked-by: Frederic Barrat

  Fred
Re: [PATCH -next] ocxl: simplify the return expression of free_function_dev()
On 21/09/2020 15:10, Qinglang Miao wrote:
> Simplify the return expression.
>
> Signed-off-by: Qinglang Miao
> ---

Thanks!
Acked-by: Frederic Barrat

>  drivers/misc/ocxl/core.c | 7 +--
>  1 file changed, 1 insertion(+), 6 deletions(-)
>
> diff --git a/drivers/misc/ocxl/core.c b/drivers/misc/ocxl/core.c
> index b7a09b21a..aebfc53a2 100644
> --- a/drivers/misc/ocxl/core.c
> +++ b/drivers/misc/ocxl/core.c
> @@ -327,14 +327,9 @@ static void free_function_dev(struct device *dev)
>
>  static int set_function_device(struct ocxl_fn *fn, struct pci_dev *dev)
>  {
> -	int rc;
> -
>  	fn->dev.parent = &dev->dev;
>  	fn->dev.release = free_function_dev;
> -	rc = dev_set_name(&fn->dev, "ocxlfn.%s", dev_name(&dev->dev));
> -	if (rc)
> -		return rc;
> -	return 0;
> +	return dev_set_name(&fn->dev, "ocxlfn.%s", dev_name(&dev->dev));
>  }
>
>  static int assign_function_actag(struct ocxl_fn *fn)
Re: [PATCH AUTOSEL 5.4 101/330] powerpc/powernv/ioda: Fix ref count for devices with their own PE
On 19/09/2020 20:10, Sasha Levin wrote:
> On Fri, Sep 18, 2020 at 08:35:06AM +0200, Frederic Barrat wrote:
>> On 18/09/2020 03:57, Sasha Levin wrote:
>>> From: Frederic Barrat
>>>
>>> [ Upstream commit 05dd7da76986937fb288b4213b1fa10dbe0d1b33 ]
>>
>> This patch is not desirable for stable, for 5.4 and 4.19 (it was already
>> flagged by autosel back in April. Not sure why it's showing again now)
>
> Hey Fred,
>
> This was a bit of a "lie", it wasn't a run of AUTOSEL, but rather an
> audit of patches that went into distro/vendor trees but not into the
> upstream stable trees.
>
> I can see that this patch was pulled into Ubuntu's 5.4 tree, is it not
> needed in the upstream stable tree?

That patch in itself is useless (it replaces a ref counter leak by
another one). It was part of a longer series that we backported to
Ubuntu's 5.4 tree. So it's really not needed on the stable trees. It
likely wouldn't hurt or break anything, but there's really no point.

  Fred
Re: [PATCH] ocxl: fix kconfig dependency warning for OCXL
On 18/09/2020 11:41, Necip Fazil Yildiran wrote:
> When OCXL is enabled and HOTPLUG_PCI is disabled, it results in the
> following Kbuild warning:
>
> WARNING: unmet direct dependencies detected for HOTPLUG_PCI_POWERNV
>   Depends on [n]: PCI [=y] && HOTPLUG_PCI [=n] && PPC_POWERNV [=y] && EEH [=y]
>   Selected by [y]:
>   - OCXL [=y] && PPC_POWERNV [=y] && PCI [=y] && EEH [=y]
>
> The reason is that OCXL selects HOTPLUG_PCI_POWERNV without depending on
> or selecting HOTPLUG_PCI while HOTPLUG_PCI_POWERNV is subordinate to
> HOTPLUG_PCI.
>
> HOTPLUG_PCI_POWERNV is a visible symbol with a set of dependencies.
> Selecting it will lead to overlooking its other dependencies as well.
>
> Let OCXL depend on HOTPLUG_PCI_POWERNV instead to avoid Kbuild issues.
>
> Fixes: 49ce94b8677c ("ocxl: Add PCI hotplug dependency to Kconfig")
> Signed-off-by: Necip Fazil Yildiran
> ---

OK, that makes sense, thanks!
Acked-by: Frederic Barrat

>  drivers/misc/ocxl/Kconfig | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/misc/ocxl/Kconfig b/drivers/misc/ocxl/Kconfig
> index 6551007a066c..947294f6d7f4 100644
> --- a/drivers/misc/ocxl/Kconfig
> +++ b/drivers/misc/ocxl/Kconfig
> @@ -9,9 +9,8 @@ config OCXL_BASE
>
>  config OCXL
>  	tristate "OpenCAPI coherent accelerator support"
> -	depends on PPC_POWERNV && PCI && EEH
> +	depends on PPC_POWERNV && PCI && EEH && HOTPLUG_PCI_POWERNV
>  	select OCXL_BASE
> -	select HOTPLUG_PCI_POWERNV
>  	default m
>  	help
>  	  Select this option to enable the ocxl driver for Open
Re: [PATCH AUTOSEL 5.4 101/330] powerpc/powernv/ioda: Fix ref count for devices with their own PE
On 18/09/2020 03:57, Sasha Levin wrote:
> From: Frederic Barrat
>
> [ Upstream commit 05dd7da76986937fb288b4213b1fa10dbe0d1b33 ]

This patch is not desirable for stable, for 5.4 and 4.19 (it was already
flagged by autosel back in April. Not sure why it's showing again now)

  Fred

> The pci_dn structure used to store a pointer to the struct pci_dev, so
> taking a reference on the device was required. However, the pci_dev
> pointer was later removed from the pci_dn structure, but the reference
> was kept for the npu device.
> See commit 902bdc57451c ("powerpc/powernv/idoa: Remove unnecessary
> pcidev from pci_dn").
>
> We don't need to take a reference on the device when assigning the PE
> as the struct pnv_ioda_pe is cleaned up at the same time as
> the (physical) device is released. Doing so prevents the device from
> being released, which is a problem for opencapi devices, since we want
> to be able to remove them through PCI hotplug.
>
> Now the ugly part: nvlink npu devices are not meant to be
> released. Because of the above, we've always leaked a reference and
> simply removing it now is dangerous and would likely require more
> work. There's currently no release device callback for nvlink devices
> for example. So to be safe, this patch leaks a reference on the npu
> device, but only for nvlink and not opencapi.
>
> Signed-off-by: Frederic Barrat
> Reviewed-by: Andrew Donnellan
> Signed-off-by: Michael Ellerman
> Link: https://lore.kernel.org/r/20191121134918.7155-2-fbar...@linux.ibm.com
> Signed-off-by: Sasha Levin
> ---
>  arch/powerpc/platforms/powernv/pci-ioda.c | 19 ---
>  1 file changed, 12 insertions(+), 7 deletions(-)
>
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index 058223233088e..e9cda7e316a50 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -1062,14 +1062,13 @@ static struct pnv_ioda_pe *pnv_ioda_setup_dev_PE(struct pci_dev *dev)
>  		return NULL;
>  	}
>
> -	/* NOTE: We get only one ref to the pci_dev for the pdn, not for the
> -	 * pointer in the PE data structure, both should be destroyed at the
> -	 * same time. However, this needs to be looked at more closely again
> -	 * once we actually start removing things (Hotplug, SR-IOV, ...)
> +	/* NOTE: We don't get a reference for the pointer in the PE
> +	 * data structure, both the device and PE structures should be
> +	 * destroyed at the same time. However, removing nvlink
> +	 * devices will need some work.
>  	 *
>  	 * At some point we want to remove the PDN completely anyways
>  	 */
> -	pci_dev_get(dev);
>  	pdn->pe_number = pe->pe_number;
>  	pe->flags = PNV_IODA_PE_DEV;
>  	pe->pdev = dev;
> @@ -1084,7 +1083,6 @@ static struct pnv_ioda_pe *pnv_ioda_setup_dev_PE(struct pci_dev *dev)
>  		pnv_ioda_free_pe(pe);
>  		pdn->pe_number = IODA_INVALID_PE;
>  		pe->pdev = NULL;
> -		pci_dev_put(dev);
>  		return NULL;
>  	}
>
> @@ -1205,6 +1203,14 @@ static struct pnv_ioda_pe *pnv_ioda_setup_npu_PE(struct pci_dev *npu_pdev)
>  	struct pci_controller *hose = pci_bus_to_host(npu_pdev->bus);
>  	struct pnv_phb *phb = hose->private_data;
>
> +	/*
> +	 * Intentionally leak a reference on the npu device (for
> +	 * nvlink only; this is not an opencapi path) to make sure it
> +	 * never goes away, as it's been the case all along and some
> +	 * work is needed otherwise.
> +	 */
> +	pci_dev_get(npu_pdev);
> +
>  	/*
>  	 * Due to a hardware errata PE#0 on the NPU is reserved for
>  	 * error handling. This means we only have three PEs remaining
> @@ -1228,7 +1234,6 @@ static struct pnv_ioda_pe *pnv_ioda_setup_npu_PE(struct pci_dev *npu_pdev)
>  			 */
>  			dev_info(&npu_pdev->dev,
>  				"Associating to existing PE %x\n", pe_num);
> -			pci_dev_get(npu_pdev);
>  			npu_pdn = pci_get_pdn(npu_pdev);
>  			rid = npu_pdev->bus->number << 8 | npu_pdn->devfn;
>  			npu_pdn->pe_number = pe_num;
Re: [PATCH v2 02/12] ocxl: Change type of pasid to unsigned int
On 18/06/2020 17:37, Fenghua Yu wrote:
> The first 3 patches clean up pasid and flag definitions to prepare for
> following patches.
>
> If you think this patch can be dropped, we will drop it.

Yes, I think that's the case.

Thanks,

  Fred
Re: [PATCH v2 02/12] ocxl: Change type of pasid to unsigned int
On 13/06/2020 02:41, Fenghua Yu wrote:
> PASID is defined as "int" although it's a 20-bit value and shouldn't be
> negative int. To be consistent with type defined in iommu, define PASID
> as "unsigned int".

It looks like this patch was considered because of the use of 'pasid' in
variable or function names. The ocxl driver only makes sense on powerpc
and shouldn't compile on anything else, so it's probably useless in the
context of that series.
The pasid here is defined by the opencapi specification
(https://opencapi.org), it is borrowed from the PCI world and you could
argue it could be an unsigned int. But then I think the patch doesn't go
far enough. But considering it's not used on x86, I think this patch can
be dropped.

  Fred

> Suggested-by: Thomas Gleixner
> Signed-off-by: Fenghua Yu
> Reviewed-by: Tony Luck
> ---
> v2:
> - Create this new patch to define PASID as "unsigned int" consistently in
>   ocxl (Thomas)
>
>  drivers/misc/ocxl/config.c| 3 ++-
>  drivers/misc/ocxl/link.c | 6 +++---
>  drivers/misc/ocxl/ocxl_internal.h | 6 +++---
>  drivers/misc/ocxl/pasid.c | 2 +-
>  drivers/misc/ocxl/trace.h | 20 ++--
>  include/misc/ocxl.h | 6 +++---
>  6 files changed, 22 insertions(+), 21 deletions(-)
>
> diff --git a/drivers/misc/ocxl/config.c b/drivers/misc/ocxl/config.c
> index c8e19bfb5ef9..22d034caed3d 100644
> --- a/drivers/misc/ocxl/config.c
> +++ b/drivers/misc/ocxl/config.c
> @@ -806,7 +806,8 @@ int ocxl_config_set_TL(struct pci_dev *dev, int tl_dvsec)
>  }
>  EXPORT_SYMBOL_GPL(ocxl_config_set_TL);
>
> -int ocxl_config_terminate_pasid(struct pci_dev *dev, int afu_control, int pasid)
> +int ocxl_config_terminate_pasid(struct pci_dev *dev, int afu_control,
> +				unsigned int pasid)
>  {
>  	u32 val;
>  	unsigned long timeout;
> diff --git a/drivers/misc/ocxl/link.c b/drivers/misc/ocxl/link.c
> index 58d111afd9f6..931f6ae022db 100644
> --- a/drivers/misc/ocxl/link.c
> +++ b/drivers/misc/ocxl/link.c
> @@ -492,7 +492,7 @@ static u64 calculate_cfg_state(bool kernel)
>  	return state;
>  }
>
> -int ocxl_link_add_pe(void *link_handle, int pasid, u32 pidr, u32 tidr,
> +int ocxl_link_add_pe(void *link_handle, unsigned int pasid, u32 pidr, u32 tidr,
>  		u64 amr, struct mm_struct *mm,
>  		void (*xsl_err_cb)(void *data, u64 addr, u64 dsisr),
>  		void *xsl_err_data)
> @@ -572,7 +572,7 @@ int ocxl_link_add_pe(void *link_handle, int pasid, u32 pidr, u32 tidr,
>  }
>  EXPORT_SYMBOL_GPL(ocxl_link_add_pe);
>
> -int ocxl_link_update_pe(void *link_handle, int pasid, __u16 tid)
> +int ocxl_link_update_pe(void *link_handle, unsigned int pasid, __u16 tid)
>  {
>  	struct ocxl_link *link = (struct ocxl_link *) link_handle;
>  	struct spa *spa = link->spa;
> @@ -608,7 +608,7 @@ int ocxl_link_update_pe(void *link_handle, int pasid, __u16 tid)
>  	return rc;
>  }
>
> -int ocxl_link_remove_pe(void *link_handle, int pasid)
> +int ocxl_link_remove_pe(void *link_handle, unsigned int pasid)
>  {
>  	struct ocxl_link *link = (struct ocxl_link *) link_handle;
>  	struct spa *spa = link->spa;
> diff --git a/drivers/misc/ocxl/ocxl_internal.h b/drivers/misc/ocxl/ocxl_internal.h
> index 345bf843a38e..3ca982ba7472 100644
> --- a/drivers/misc/ocxl/ocxl_internal.h
> +++ b/drivers/misc/ocxl/ocxl_internal.h
> @@ -41,7 +41,7 @@ struct ocxl_afu {
>  	struct ocxl_afu_config config;
>  	int pasid_base;
>  	int pasid_count; /* opened contexts */
> -	int pasid_max; /* maximum number of contexts */
> +	unsigned int pasid_max; /* maximum number of contexts */
>  	int actag_base;
>  	int actag_enabled;
>  	struct mutex contexts_lock;
> @@ -69,7 +69,7 @@ struct ocxl_xsl_error {
>
>  struct ocxl_context {
>  	struct ocxl_afu *afu;
> -	int pasid;
> +	unsigned int pasid;
>  	struct mutex status_mutex;
>  	enum ocxl_context_status status;
>  	struct address_space *mapping;
> @@ -128,7 +128,7 @@ int ocxl_config_check_afu_index(struct pci_dev *dev,
>   * pasid: the PASID for the AFU context
>   * tid: the new thread id for the process element
>   */
> -int ocxl_link_update_pe(void *link_handle, int pasid, __u16 tid);
> +int ocxl_link_update_pe(void *link_handle, unsigned int pasid, __u16 tid);
>
>  int ocxl_context_mmap(struct ocxl_context *ctx,
>  			struct vm_area_struct *vma);
> diff --git a/drivers/misc/ocxl/pasid.c b/drivers/misc/ocxl/pasid.c
> index d14cb56e6920..a151fc8f0bec 100644
> --- a/drivers/misc/ocxl/pasid.c
> +++ b/drivers/misc/ocxl/pasid.c
> @@ -80,7 +80,7 @@ static void range_free(struct list_head *head, u32 start, u32 size,
>
>  int ocxl_pasid_afu_alloc(struct ocxl_fn *fn, u32 size)
>  {
> -	int max_pasid;
> +	unsigned int max_pasid;
>
>  	if (fn->config.max_pasid_log < 0)
>  		return -ENOSPC;
> diff --git a/drivers/misc/ocxl/trace.h b/drivers/misc/ocxl/trace.h
> index 17e21cb2addd..019e2fc63b1d 100644
> --- a/drivers/misc/ocxl/trace.h
> +++
Re: [PATCH] cxl: Fix kobject memleak
On 02/06/2020 14:07, Wang Hai wrote:
> Currently the error return path from kobject_init_and_add() is not
> followed by a call to kobject_put() - which means we are leaking
> the kobject.
>
> Fix it by adding a call to kobject_put() in the error path of
> kobject_init_and_add().
>
> Fixes: b087e6190ddc ("cxl: Export optional AFU configuration record in sysfs")
> Reported-by: Hulk Robot
> Signed-off-by: Wang Hai

Indeed, a call to kobject_put() is needed when the init fails.
Thanks!
Acked-by: Frederic Barrat

> ---
>  drivers/misc/cxl/sysfs.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/misc/cxl/sysfs.c b/drivers/misc/cxl/sysfs.c
> index f0263d1..d97a243 100644
> --- a/drivers/misc/cxl/sysfs.c
> +++ b/drivers/misc/cxl/sysfs.c
> @@ -624,7 +624,7 @@ static struct afu_config_record *cxl_sysfs_afu_new_cr(struct cxl_afu *afu, int c
>  	rc = kobject_init_and_add(&cr->kobj, &afu_config_record_type,
>  				  &afu->dev.kobj, "cr%i", cr->cr);
>  	if (rc)
> -		goto err;
> +		goto err1;
>
>  	rc = sysfs_create_bin_file(&cr->kobj, &cr->config_attr);
>  	if (rc)
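The rule discussed above (once an object has been initialized with a reference, a failed "add" must be undone with a put, not a bare free) can be sketched in plain userspace C. Everything here is hypothetical and only mimics the shape of the kernel pattern: struct record, record_put() and record_init_and_add() are illustrative names, and the assumption is that the err1 label in the patch leads to the put.

```c
#include <stdlib.h>

/* Hypothetical refcounted object; 'released' lets a caller observe cleanup. */
struct record {
	int refcount;
	int *released;
};

/* Drop one reference; the release path runs when the last one goes away. */
static void record_put(struct record *r)
{
	if (--r->refcount == 0) {
		*r->released = 1;
		free(r);
	}
}

/*
 * Initialize (taking the first reference), then try to "add" the object.
 * Even when the add step fails, the initial reference is still held.
 */
static int record_init_and_add(struct record *r, int *released, int fail_add)
{
	r->refcount = 1;
	r->released = released;
	return fail_add ? -1 : 0;
}

static struct record *record_new(int *released, int fail_add)
{
	struct record *r = malloc(sizeof(*r));

	if (!r)
		return NULL;
	if (record_init_and_add(r, released, fail_add)) {
		/* Error path: drop the reference so the release path runs. */
		record_put(r);
		return NULL;
	}
	return r;
}
```

Skipping the put on the error path would leave the count at one forever, which is the leak the patch closes.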
Re: [PATCH 4/5] ocxl: Add functions to map/unmap LPC memory
> diff --git a/drivers/misc/ocxl/link.c b/drivers/misc/ocxl/link.c
> index 2874811a4398..9e303a5f4d85 100644
> --- a/drivers/misc/ocxl/link.c
> +++ b/drivers/misc/ocxl/link.c
> @@ -738,7 +738,7 @@ int ocxl_link_add_lpc_mem(void *link_handle, u64 size)
>  }
>  EXPORT_SYMBOL_GPL(ocxl_link_add_lpc_mem);
>
> -u64 ocxl_link_lpc_map(void *link_handle, struct pci_dev *pdev)
> +u64 ocxl_link_lpc_online(void *link_handle, struct pci_dev *pdev)
>  {
>  	struct ocxl_link *link = (struct ocxl_link *) link_handle;

A bit of a nitpick, but is there any specific reason to rename with
"online" suffix? I'm discovering it myself, but with memory hotplug,
"onlining" seems to refer to the second, a.k.a logical memory hotplug
phase (as described in Documentation/admin-guide/mm/memory-hotplug.rst).
We'll need to worry about it, but the function here is really doing the
first phase, a.k.a physical memory hotplug.

  Fred

> @@ -759,7 +759,7 @@ u64 ocxl_link_lpc_map(void *link_handle, struct pci_dev *pdev)
>  	return link->lpc_mem;
>  }
>
> -void ocxl_link_lpc_release(void *link_handle, struct pci_dev *pdev)
> +void ocxl_link_lpc_offline(void *link_handle, struct pci_dev *pdev)
>  {
>  	struct ocxl_link *link = (struct ocxl_link *) link_handle;
>
> diff --git a/drivers/misc/ocxl/ocxl_internal.h b/drivers/misc/ocxl/ocxl_internal.h
> index db2647a90fc8..5656a4aab5b7 100644
> --- a/drivers/misc/ocxl/ocxl_internal.h
> +++ b/drivers/misc/ocxl/ocxl_internal.h
> @@ -52,6 +52,12 @@ struct ocxl_afu {
>  	void __iomem *global_mmio_ptr;
>  	u64 pp_mmio_start;
>  	void *private;
> +	u64 lpc_base_addr; /* Covers both LPC & special purpose memory */
> +	struct bin_attribute attr_global_mmio;
> +	struct bin_attribute attr_lpc_mem;
> +	struct resource lpc_res;
> +	struct bin_attribute attr_special_purpose_mem;
> +	struct resource special_purpose_res;
>  };
>
>  enum ocxl_context_status {
> @@ -170,7 +176,7 @@ extern u64 ocxl_link_get_lpc_mem_sz(void *link_handle);
>   * @link_handle: The OpenCAPI link handle
>   * @pdev: A device that is on the link
>   */
> -u64 ocxl_link_lpc_map(void *link_handle, struct pci_dev *pdev);
> +u64 ocxl_link_lpc_online(void *link_handle, struct pci_dev *pdev);
>
>  /**
>   * Release the LPC memory device for an OpenCAPI device
> @@ -181,6 +187,6 @@ u64 ocxl_link_lpc_map(void *link_handle, struct pci_dev *pdev);
>   * @link_handle: The OpenCAPI link handle
>   * @pdev: A device that is on the link
>   */
> -void ocxl_link_lpc_release(void *link_handle, struct pci_dev *pdev);
> +void ocxl_link_lpc_offline(void *link_handle, struct pci_dev *pdev);
>
>  #endif /* _OCXL_INTERNAL_H_ */
> diff --git a/include/misc/ocxl.h b/include/misc/ocxl.h
> index 06dd5839e438..a1897737908d 100644
> --- a/include/misc/ocxl.h
> +++ b/include/misc/ocxl.h
> @@ -212,6 +212,24 @@ int ocxl_irq_set_handler(struct ocxl_context *ctx, int irq_id,
>
>  // AFU Metadata
>
> +/**
> + * Map the LPC system & special purpose memory for an AFU
> + *
> + * Do not call this during device discovery, as there may be multiple
> + * devices on a link, and the memory is mapped for the whole link, not
> + * just one device. It should only be called after all devices have
> + * registered their memory on the link.
> + *
> + * afu: The AFU that has the LPC memory to map
> + */
> +extern int ocxl_map_lpc_mem(struct ocxl_afu *afu);
> +
> +/**
> + * Get the physical address range of LPC memory for an AFU
> + * afu: The AFU associated with the LPC memory
> + */
> +extern struct resource *ocxl_afu_lpc_mem(struct ocxl_afu *afu);
> +
>  /**
>   * Get a pointer to the config for an AFU
>   *
Re: [PATCH 3/5] ocxl: Tally up the LPC memory on a link & allow it to be mapped
On 19/09/2019 06:55, Alastair D'Silva wrote:
> On Wed, 2019-09-18 at 16:02 +0200, Frederic Barrat wrote:
>> On 17/09/2019 03:42, Alastair D'Silva wrote:
>>> From: Alastair D'Silva
>>>
>>> Tally up the LPC memory on an OpenCAPI link & allow it to be mapped
>>>
>>> Signed-off-by: Alastair D'Silva
>>> ---
>>>  drivers/misc/ocxl/core.c | 9 +
>>>  drivers/misc/ocxl/link.c | 61 +++
>>>  drivers/misc/ocxl/ocxl_internal.h | 42 +
>>>  3 files changed, 112 insertions(+)
>>>
>>> diff --git a/drivers/misc/ocxl/core.c b/drivers/misc/ocxl/core.c
>>> index b7a09b21ab36..fdfe4e0a34e1 100644
>>> --- a/drivers/misc/ocxl/core.c
>>> +++ b/drivers/misc/ocxl/core.c
>>> @@ -230,8 +230,17 @@ static int configure_afu(struct ocxl_afu *afu, u8 afu_idx, struct pci_dev *dev)
>>>  	if (rc)
>>>  		goto err_free_pasid;
>>>
>>> +	if (afu->config.lpc_mem_size || afu->config.special_purpose_mem_size) {
>>> +		rc = ocxl_link_add_lpc_mem(afu->fn->link,
>>> +			afu->config.lpc_mem_size + afu->config.special_purpose_mem_size);
>>
>> I don't think we should count the special purpose memory, as it's not
>> meant to be accessed through the GPU mem BAR, but I'll check.
>
> At least for OpenCAPI 3.0, there is no other in-spec way to access the
> memory if it is not mapped by the NPU.

Yes, that's clarified now and we should take the special purpose memory
into account when defining the full range.

  Fred

>> What happens when unconfiguring the AFU? We should reduce the range
>> (see also below). Partial reconfig doesn't seem so far off, so we
>> should take it into account.
>
> The mapping is left until the last AFU on the link offlines its memory,
> at which point we clear the mapping from the NPU.
>
>>> +		if (rc)
>>> +			goto err_free_mmio;
>>> +	}
>>> +
>>>  	return 0;
>>>
>>> +err_free_mmio:
>>> +	unmap_mmio_areas(afu);
>>>  err_free_pasid:
>>>  	reclaim_afu_pasid(afu);
>>>  err_free_actag:
>>> diff --git a/drivers/misc/ocxl/link.c b/drivers/misc/ocxl/link.c
>>> index 58d111afd9f6..2874811a4398 100644
>>> --- a/drivers/misc/ocxl/link.c
>>> +++ b/drivers/misc/ocxl/link.c
>>> @@ -84,6 +84,11 @@ struct ocxl_link {
>>>  	int dev;
>>>  	atomic_t irq_available;
>>>  	struct spa *spa;
>>> +	struct mutex lpc_mem_lock;
>>> +	u64 lpc_mem_sz; /* Total amount of LPC memory presented on the link */
>>> +	u64 lpc_mem;
>>> +	int lpc_consumers;
>>> +
>>>  	void *platform_data;
>>>  };
>>>  static struct list_head links_list = LIST_HEAD_INIT(links_list);
>>> @@ -396,6 +401,8 @@ static int alloc_link(struct pci_dev *dev, int PE_mask, struct ocxl_link **out_l
>>>  	if (rc)
>>>  		goto err_spa;
>>>
>>> +	mutex_init(&link->lpc_mem_lock);
>>> +
>>>  	/* platform specific hook */
>>>  	rc = pnv_ocxl_spa_setup(dev, link->spa->spa_mem, PE_mask,
>>>  				&link->platform_data);
>>> @@ -711,3 +718,57 @@ void ocxl_link_free_irq(void *link_handle, int hw_irq)
>>>  	atomic_inc(&link->irq_available);
>>>  }
>>>  EXPORT_SYMBOL_GPL(ocxl_link_free_irq);
>>> +
>>> +int ocxl_link_add_lpc_mem(void *link_handle, u64 size)
>>> +{
>>> +	struct ocxl_link *link = (struct ocxl_link *) link_handle;
>>> +
>>> +	u64 orig_size;
>>> +	bool good = false;
>>> +
>>> +	mutex_lock(&link->lpc_mem_lock);
>>> +	orig_size = link->lpc_mem_sz;
>>> +	link->lpc_mem_sz += size;
>>
>> We have a choice to make here:
>> 1. either we only support one LPC memory-carrying AFU (and the above is
>>    overkill)
>> 2. or we support multiple AFUs with LPC memory (on the same function),
>>    but then I think the above is too simple.
>>
>> From the opencapi spec, each AFU can define a chunk of memory with a
>> starting address and a size. There's no rule which says they have to be
>> contiguous. There's no rule which says it must start at 0. So to
>> support multiple AFUs with LPC memory, we should record the current
>> maximum range instead of just the global size. Ultimately, we need to
>> tell the NPU the range of permissible addresses. It starts at 0, so we
>> need to take into account any initial offset and holes.
>>
>> I would go for option 2, to at least be consistent within ocxl and
>> support multiple AFUs. Even though I don't think we'll see FPGA images
>> with multiple AFUs with LPC memory any time soon.
>
> I'll rework this to take an offset & size, the NPU will map from the
> base address up to the largest offset + size provided across all AFUs
> on the link.
>
>>> +	good = orig_size < link->lpc_mem_sz;
>>> +	mutex_unlock(&link->lpc_mem_lock);
>>> +
>>> +	// Check for overflow
>>> +	return (good) ? 0 : -EINVAL;
>>> +}
>>> +EXPORT_SYMBOL_GPL(ocxl_link_add_lpc_mem);
>>
>> Does the symbol really need to be exported? IIUC, the next patch
>> defines a higher level ocxl_afu_map_lpc_mem() which is meant to be
>> called by a calling driver.
>
> No, I'll remove it.
>
>>> +
>>> +u64 ocxl_link_lpc_map(void *link_handle, struct pci_dev *pdev)
>>> +{
>>> +	struct ocxl_link *link = (struct ocxl_link *) link_handle;
>>> +
>>> +	mutex_lock(&link->lpc_mem_lock);
>>> +	if (link->lpc_mem) {
>>> +
Re: [PATCH 2/5] powerpc: Map & release OpenCAPI LPC memory
On 19/09/2019 02:58, Alastair D'Silva wrote:
> On Wed, 2019-09-18 at 16:03 +0200, Frederic Barrat wrote:
>> On 17/09/2019 03:42, Alastair D'Silva wrote:
>>> From: Alastair D'Silva
>>>
>>> Map & release OpenCAPI LPC memory.
>>>
>>> Signed-off-by: Alastair D'Silva
>>> ---
>>>  arch/powerpc/include/asm/pnv-ocxl.h | 2 ++
>>>  arch/powerpc/platforms/powernv/ocxl.c | 42 +++
>>>  2 files changed, 44 insertions(+)
>>>
>>> diff --git a/arch/powerpc/include/asm/pnv-ocxl.h b/arch/powerpc/include/asm/pnv-ocxl.h
>>> index 7de82647e761..f8f8ffb48aa8 100644
>>> --- a/arch/powerpc/include/asm/pnv-ocxl.h
>>> +++ b/arch/powerpc/include/asm/pnv-ocxl.h
>>> @@ -32,5 +32,7 @@ extern int pnv_ocxl_spa_remove_pe_from_cache(void *platform_data, int pe_handle)
>>>
>>>  extern int pnv_ocxl_alloc_xive_irq(u32 *irq, u64 *trigger_addr);
>>>  extern void pnv_ocxl_free_xive_irq(u32 irq);
>>> +extern u64 pnv_ocxl_platform_lpc_setup(struct pci_dev *pdev, u64 size);
>>> +extern void pnv_ocxl_platform_lpc_release(struct pci_dev *pdev);
>>>
>>>  #endif /* _ASM_PNV_OCXL_H */
>>> diff --git a/arch/powerpc/platforms/powernv/ocxl.c b/arch/powerpc/platforms/powernv/ocxl.c
>>> index 8c65aacda9c8..81393728d6a3 100644
>>> --- a/arch/powerpc/platforms/powernv/ocxl.c
>>> +++ b/arch/powerpc/platforms/powernv/ocxl.c
>>> @@ -475,6 +475,48 @@ void pnv_ocxl_spa_release(void *platform_data)
>>>  }
>>>  EXPORT_SYMBOL_GPL(pnv_ocxl_spa_release);
>>>
>>> +u64 pnv_ocxl_platform_lpc_setup(struct pci_dev *pdev, u64 size)
>>> +{
>>> +	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
>>> +	struct pnv_phb *phb = hose->private_data;
>>> +	struct pci_dn *pdn = pci_get_pdn(pdev);
>>> +	u32 bdfn = (pdn->busno << 8) | pdn->devfn;
>>
>> We can spare a call to pci_get_pdn() with
>> bdfn = (pdev->bus->number << 8) | pdev->devfn;
>
> Ok.
>
>>> +	u64 base_addr = 0;
>>> +
>>> +	int rc = opal_npu_mem_alloc(phb->opal_id, bdfn, size, &base_addr);
>>> +
>>> +	WARN_ON(rc);
>>
>> Instead of a WARN, we should catch the error and return a null address
>> to the caller.
>
> base_addr will be 0 in the error case, are you suggesting we just
> remove the WARN_ON()?

Well, we don't really have any reason to keep going if the opal call
fails, right? And anyway, I wouldn't make any assumption on the content
of base_addr if the call fails.

But my remark was really to avoid polluting the logs with the WARN
output. The stack backtrace and register content are scary and are not
going to help in that situation. A proper error message is more suitable.

  Fred

>>> +
>>> +	base_addr = be64_to_cpu(base_addr);
>>> +
>>> +	rc = check_hotplug_memory_addressable(base_addr, base_addr + size);
>>
>> That code is missing?
>
> That's added in the following patch on the mm list:
> [PATCH v3 1/2] memory_hotplug: Add a bounds check to
> check_hotplug_memory_range()
>
>>> +	if (rc) {
>>> +		dev_warn(&pdev->dev,
>>> +			 "LPC memory range 0x%llx-0x%llx is not fully addressable",
>>> +			 base_addr, base_addr + size - 1);
>>> +		return 0;
>>> +	}
>>> +
>>> +
>>> +	return base_addr;
>>> +}
>>> +EXPORT_SYMBOL_GPL(pnv_ocxl_platform_lpc_setup);
>>> +
>>> +void pnv_ocxl_platform_lpc_release(struct pci_dev *pdev)
>>> +{
>>> +	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
>>> +	struct pnv_phb *phb = hose->private_data;
>>> +	struct pci_dn *pdn = pci_get_pdn(pdev);
>>> +	u32 bdfn;
>>> +	int rc;
>>> +
>>> +	bdfn = (pdn->busno << 8) | pdn->devfn;
>>> +	rc = opal_npu_mem_release(phb->opal_id, bdfn);
>>> +	WARN_ON(rc);
>>
>> Same comments as above.
>>
>>   Fred
>>
>>> +}
>>> +EXPORT_SYMBOL_GPL(pnv_ocxl_platform_lpc_release);
>>> +
>>> +
>>>  int pnv_ocxl_spa_remove_pe_from_cache(void *platform_data, int pe_handle)
>>>  {
>>>  	struct spa_data *data = (struct spa_data *) platform_data;
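To illustrate the two review points above (build the bus/device/function number directly from the bus number and devfn, and on a failed firmware call log a plain message and return a null address without trusting the output value), here is a small userspace sketch. It is not the kernel or OPAL API: mem_alloc_fn, make_bdfn(), lpc_setup() and the two fake callbacks are all hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical signature of a firmware allocation call: 0 on success. */
typedef int (*mem_alloc_fn)(uint32_t bdfn, uint64_t size, uint64_t *base);

/* Bus number in bits 15:8, device/function number in bits 7:0. */
static uint32_t make_bdfn(uint8_t bus, uint8_t devfn)
{
	return ((uint32_t)bus << 8) | devfn;
}

/* Returns the base address, or 0 if the allocation call failed. */
static uint64_t lpc_setup(uint8_t bus, uint8_t devfn, uint64_t size,
			  mem_alloc_fn alloc)
{
	uint64_t base = 0;
	uint32_t bdfn = make_bdfn(bus, devfn);
	int rc = alloc(bdfn, size, &base);

	if (rc) {
		/* A short error message, not a backtrace. */
		fprintf(stderr, "LPC setup failed for device %04x: error %d\n",
			(unsigned int)bdfn, rc);
		return 0;	/* do not trust 'base' after a failed call */
	}
	return base;
}

/* Fake firmware calls, for illustration only. */
static int alloc_ok(uint32_t bdfn, uint64_t size, uint64_t *base)
{
	(void)bdfn;
	(void)size;
	*base = 0x200000000000ULL;
	return 0;
}

static int alloc_fail(uint32_t bdfn, uint64_t size, uint64_t *base)
{
	(void)bdfn;
	(void)size;
	*base = 0xdeadbeefULL;	/* garbage left behind by a failed call */
	return -5;
}
```

Returning early on error means the caller sees a null address even when the failed call scribbled on the output parameter, which is the case the review was worried about.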
Re: [PATCH 4/5] ocxl: Add functions to map/unmap LPC memory
On 17/09/2019 03:43, Alastair D'Silva wrote:
> From: Alastair D'Silva
>
> Add functions to map/unmap LPC memory
>
> Signed-off-by: Alastair D'Silva
> ---
>  drivers/misc/ocxl/config.c| 4 +++
>  drivers/misc/ocxl/core.c | 50 +++
>  drivers/misc/ocxl/link.c | 4 +--
>  drivers/misc/ocxl/ocxl_internal.h | 10 +--
>  include/misc/ocxl.h | 18 +++
>  5 files changed, 82 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/misc/ocxl/config.c b/drivers/misc/ocxl/config.c
> index c8e19bfb5ef9..fb0c3b6f8312 100644
> --- a/drivers/misc/ocxl/config.c
> +++ b/drivers/misc/ocxl/config.c
> @@ -568,6 +568,10 @@ static int read_afu_lpc_memory_info(struct pci_dev *dev,
>  		afu->special_purpose_mem_size = total_mem_size - lpc_mem_size;
>  	}
> +
> +	dev_info(&dev->dev, "Probed LPC memory of %#llx bytes and special purpose memory of %#llx bytes\n",
> +		afu->lpc_mem_size, afu->special_purpose_mem_size);
> +
>  	return 0;
>  }
>
> diff --git a/drivers/misc/ocxl/core.c b/drivers/misc/ocxl/core.c
> index fdfe4e0a34e1..eb24bb9d655f 100644
> --- a/drivers/misc/ocxl/core.c
> +++ b/drivers/misc/ocxl/core.c
> @@ -210,6 +210,55 @@ static void unmap_mmio_areas(struct ocxl_afu *afu)
>  	release_fn_bar(afu->fn, afu->config.global_mmio_bar);
>  }
>
> +int ocxl_map_lpc_mem(struct ocxl_afu *afu)
> +{
> +	struct pci_dev *dev = to_pci_dev(afu->fn->dev.parent);
> +
> +	if ((afu->config.lpc_mem_size + afu->config.special_purpose_mem_size) == 0)
> +		return 0;
> +
> +	afu->lpc_base_addr = ocxl_link_lpc_online(afu->fn->link, dev);
> +	if (afu->lpc_base_addr == 0)
> +		return -EINVAL;
> +
> +	if (afu->config.lpc_mem_size) {
> +		afu->lpc_res.start = afu->lpc_base_addr + afu->config.lpc_mem_offset;
> +		afu->lpc_res.end = afu->lpc_res.start + afu->config.lpc_mem_size - 1;
> +	}
> +
> +	if (afu->config.special_purpose_mem_size) {
> +		afu->special_purpose_res.start = afu->lpc_base_addr +
> +			afu->config.special_purpose_mem_offset;
> +		afu->special_purpose_res.end = afu->special_purpose_res.start +
> +			afu->config.special_purpose_mem_size - 1;
> +	}
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL(ocxl_map_lpc_mem);
> +
> +struct resource *ocxl_afu_lpc_mem(struct ocxl_afu *afu)
> +{
> +	return &afu->lpc_res;
> +}
> +EXPORT_SYMBOL(ocxl_afu_lpc_mem);
> +
> +static void unmap_lpc_mem(struct ocxl_afu *afu)
> +{
> +	struct pci_dev *dev = to_pci_dev(afu->fn->dev.parent);
> +
> +	if (afu->lpc_res.start || afu->special_purpose_res.start) {
> +		void *link = afu->fn->link;
> +
> +		ocxl_link_lpc_offline(link, dev);
> +
> +		afu->lpc_res.start = 0;
> +		afu->lpc_res.end = 0;
> +		afu->special_purpose_res.start = 0;
> +		afu->special_purpose_res.end = 0;
> +	}
> +}
> +
>  static int configure_afu(struct ocxl_afu *afu, u8 afu_idx, struct pci_dev *dev)
>  {
>  	int rc;
> @@ -250,6 +299,7 @@ static int configure_afu(struct ocxl_afu *afu, u8 afu_idx, struct pci_dev *dev)
>
>  static void deconfigure_afu(struct ocxl_afu *afu)
>  {
> +	unmap_lpc_mem(afu);
>  	unmap_mmio_areas(afu);
>  	reclaim_afu_pasid(afu);
>  	reclaim_afu_actag(afu);
> diff --git a/drivers/misc/ocxl/link.c b/drivers/misc/ocxl/link.c
> index 2874811a4398..9e303a5f4d85 100644
> --- a/drivers/misc/ocxl/link.c
> +++ b/drivers/misc/ocxl/link.c
> @@ -738,7 +738,7 @@ int ocxl_link_add_lpc_mem(void *link_handle, u64 size)
>  }
>  EXPORT_SYMBOL_GPL(ocxl_link_add_lpc_mem);
>
> -u64 ocxl_link_lpc_map(void *link_handle, struct pci_dev *pdev)
> +u64 ocxl_link_lpc_online(void *link_handle, struct pci_dev *pdev)
>  {
>  	struct ocxl_link *link = (struct ocxl_link *) link_handle;
>
> @@ -759,7 +759,7 @@ u64 ocxl_link_lpc_map(void *link_handle, struct pci_dev *pdev)
>  	return link->lpc_mem;
>  }
>
> -void ocxl_link_lpc_release(void *link_handle, struct pci_dev *pdev)
> +void ocxl_link_lpc_offline(void *link_handle, struct pci_dev *pdev)

Could we avoid the renaming by squashing it with the previous patch?

>  {
>  	struct ocxl_link *link = (struct ocxl_link *) link_handle;
>
> diff --git a/drivers/misc/ocxl/ocxl_internal.h b/drivers/misc/ocxl/ocxl_internal.h
> index db2647a90fc8..5656a4aab5b7 100644
> --- a/drivers/misc/ocxl/ocxl_internal.h
> +++ b/drivers/misc/ocxl/ocxl_internal.h
> @@ -52,6 +52,12 @@ struct ocxl_afu {
>  	void __iomem *global_mmio_ptr;
>  	u64 pp_mmio_start;
>  	void *private;
> +	u64 lpc_base_addr; /* Covers both LPC & special purpose memory */
> +	struct bin_attribute attr_global_mmio;
> +	struct bin_attribute attr_lpc_mem;
> +	struct resource lpc_res;
> +	struct bin_attribute attr_special_purpose_mem;
> +	struct resource special_purpose_res;
>  };
>
>  enum ocxl_context_status {
> @@ -170,7 +176,7 @@ extern u64
Re: [PATCH 3/5] ocxl: Tally up the LPC memory on a link & allow it to be mapped
On 17/09/2019 at 03:42, Alastair D'Silva wrote:
From: Alastair D'Silva

Tally up the LPC memory on an OpenCAPI link & allow it to be mapped

Signed-off-by: Alastair D'Silva
---
 drivers/misc/ocxl/core.c          |  9 +
 drivers/misc/ocxl/link.c          | 61 +++
 drivers/misc/ocxl/ocxl_internal.h | 42 +
 3 files changed, 112 insertions(+)

diff --git a/drivers/misc/ocxl/core.c b/drivers/misc/ocxl/core.c
index b7a09b21ab36..fdfe4e0a34e1 100644
--- a/drivers/misc/ocxl/core.c
+++ b/drivers/misc/ocxl/core.c
@@ -230,8 +230,17 @@ static int configure_afu(struct ocxl_afu *afu, u8 afu_idx, struct pci_dev *dev)
 	if (rc)
 		goto err_free_pasid;
+	if (afu->config.lpc_mem_size || afu->config.special_purpose_mem_size) {
+		rc = ocxl_link_add_lpc_mem(afu->fn->link,
+			afu->config.lpc_mem_size + afu->config.special_purpose_mem_size);

I don't think we should count the special purpose memory, as it's not meant to be accessed through the GPU mem BAR, but I'll check.

What happens when unconfiguring the AFU? We should reduce the range (see also below). Partial reconfig doesn't seem so far off, so we should take it into account.

+		if (rc)
+			goto err_free_mmio;
+	}
+
 	return 0;
+err_free_mmio:
+	unmap_mmio_areas(afu);
 err_free_pasid:
 	reclaim_afu_pasid(afu);
 err_free_actag:
diff --git a/drivers/misc/ocxl/link.c b/drivers/misc/ocxl/link.c
index 58d111afd9f6..2874811a4398 100644
--- a/drivers/misc/ocxl/link.c
+++ b/drivers/misc/ocxl/link.c
@@ -84,6 +84,11 @@ struct ocxl_link {
 	int dev;
 	atomic_t irq_available;
 	struct spa *spa;
+	struct mutex lpc_mem_lock;
+	u64 lpc_mem_sz; /* Total amount of LPC memory presented on the link */
+	u64 lpc_mem;
+	int lpc_consumers;
+
 	void *platform_data;
 };
 static struct list_head links_list = LIST_HEAD_INIT(links_list);
@@ -396,6 +401,8 @@ static int alloc_link(struct pci_dev *dev, int PE_mask, struct ocxl_link **out_l
 	if (rc)
 		goto err_spa;
+	mutex_init(&link->lpc_mem_lock);
+
 	/* platform specific hook */
 	rc = pnv_ocxl_spa_setup(dev, link->spa->spa_mem, PE_mask,
 				&link->platform_data);
@@ -711,3 +718,57 @@ void ocxl_link_free_irq(void *link_handle, int hw_irq)
 	atomic_inc(&link->irq_available);
 }
 EXPORT_SYMBOL_GPL(ocxl_link_free_irq);
+
+int ocxl_link_add_lpc_mem(void *link_handle, u64 size)
+{
+	struct ocxl_link *link = (struct ocxl_link *) link_handle;
+
+	u64 orig_size;
+	bool good = false;
+
+	mutex_lock(&link->lpc_mem_lock);
+	orig_size = link->lpc_mem_sz;
+	link->lpc_mem_sz += size;

We have a choice to make here:
1. either we only support one LPC memory-carrying AFU (and the above is overkill)
2. or we support multiple AFUs with LPC memory (on the same function), but then I think the above is too simple.

From the OpenCAPI spec, each AFU can define a chunk of memory with a starting address and a size. There's no rule which says they have to be contiguous. There's no rule which says it must start at 0. So to support multiple AFUs with LPC memory, we should record the current maximum range instead of just the global size. Ultimately, we need to tell the NPU the range of permissible addresses. It starts at 0, so we need to take into account any initial offset and holes.

I would go for option 2, to at least be consistent within ocxl and support multiple AFUs. Even though I don't think we'll see FPGA images with multiple AFUs with LPC memory any time soon.

+	good = orig_size < link->lpc_mem_sz;
+	mutex_unlock(&link->lpc_mem_lock);
+
+	// Check for overflow
+	return (good) ? 0 : -EINVAL;
+}
+EXPORT_SYMBOL_GPL(ocxl_link_add_lpc_mem);

Does the symbol really need to be exported? IIUC, the next patch defines a higher level ocxl_afu_map_lpc_mem() which is meant to be called by a calling driver.

+
+u64 ocxl_link_lpc_map(void *link_handle, struct pci_dev *pdev)
+{
+	struct ocxl_link *link = (struct ocxl_link *) link_handle;
+
+	mutex_lock(&link->lpc_mem_lock);
+	if (link->lpc_mem) {
+		u64 lpc_mem = link->lpc_mem;
+
+		link->lpc_consumers++;
+		mutex_unlock(&link->lpc_mem_lock);
+		return lpc_mem;
+	}
+
+	link->lpc_mem = pnv_ocxl_platform_lpc_setup(pdev, link->lpc_mem_sz);
+	if (link->lpc_mem)
+		link->lpc_consumers++;
+	mutex_unlock(&link->lpc_mem_lock);
+
+	return link->lpc_mem;

Should be cached in a temp variable, like on the fast path, otherwise it's accessed with no lock.

+}
+
+void ocxl_link_lpc_release(void *link_handle, struct pci_dev *pdev)
+{
+	struct ocxl_link *link = (struct ocxl_link *) link_handle;
+
+	mutex_lock(&link->lpc_mem_lock);
+	link->lpc_consumers--;
+	if
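
A minimal userspace sketch of the "record the maximum range" idea from the review above, under stated assumptions: struct lpc_range and lpc_range_add() are hypothetical names that do not exist in ocxl, and the locking that the driver does with lpc_mem_lock is left out. It only shows how tracking the highest end address covers an initial offset and holes between AFUs, where summing the sizes would not.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the LPC bookkeeping kept per link. */
struct lpc_range {
	uint64_t end;	/* one past the highest LPC address seen; the range starts at 0 */
};

/*
 * Record one AFU's LPC window [offset, offset + size).
 * Only the upper bound is tracked, so an initial offset and any holes
 * between AFUs are covered by the range reported to the platform.
 * Returns 0 on success, -EINVAL if the window wraps past the 64-bit limit.
 */
int lpc_range_add(struct lpc_range *r, uint64_t offset, uint64_t size)
{
	uint64_t end;

	assert(r != NULL);

	if (size == 0)
		return 0;	/* nothing to record */

	if (offset > UINT64_MAX - size)
		return -EINVAL;	/* offset + size would overflow */

	end = offset + size;
	if (end > r->end)
		r->end = end;

	return 0;
}
```

With two AFUs at offsets 0x1000 and 0x8000, the recorded end is 0xC000 even though the two sizes only add up to 0x5000. Shrinking the range again when an AFU is unconfigured, which the review also asks about, would need per-AFU records and is not covered here.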
Re: [PATCH 2/5] powerpc: Map & release OpenCAPI LPC memory
On 17/09/2019 at 03:42, Alastair D'Silva wrote:
From: Alastair D'Silva

Map & release OpenCAPI LPC memory.

Signed-off-by: Alastair D'Silva
---
 arch/powerpc/include/asm/pnv-ocxl.h   |  2 ++
 arch/powerpc/platforms/powernv/ocxl.c | 42 +++
 2 files changed, 44 insertions(+)

diff --git a/arch/powerpc/include/asm/pnv-ocxl.h b/arch/powerpc/include/asm/pnv-ocxl.h
index 7de82647e761..f8f8ffb48aa8 100644
--- a/arch/powerpc/include/asm/pnv-ocxl.h
+++ b/arch/powerpc/include/asm/pnv-ocxl.h
@@ -32,5 +32,7 @@ extern int pnv_ocxl_spa_remove_pe_from_cache(void *platform_data, int pe_handle)
 extern int pnv_ocxl_alloc_xive_irq(u32 *irq, u64 *trigger_addr);
 extern void pnv_ocxl_free_xive_irq(u32 irq);
+extern u64 pnv_ocxl_platform_lpc_setup(struct pci_dev *pdev, u64 size);
+extern void pnv_ocxl_platform_lpc_release(struct pci_dev *pdev);
 #endif /* _ASM_PNV_OCXL_H */
diff --git a/arch/powerpc/platforms/powernv/ocxl.c b/arch/powerpc/platforms/powernv/ocxl.c
index 8c65aacda9c8..81393728d6a3 100644
--- a/arch/powerpc/platforms/powernv/ocxl.c
+++ b/arch/powerpc/platforms/powernv/ocxl.c
@@ -475,6 +475,48 @@ void pnv_ocxl_spa_release(void *platform_data)
 }
 EXPORT_SYMBOL_GPL(pnv_ocxl_spa_release);
+u64 pnv_ocxl_platform_lpc_setup(struct pci_dev *pdev, u64 size)
+{
+	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
+	struct pnv_phb *phb = hose->private_data;
+	struct pci_dn *pdn = pci_get_pdn(pdev);
+	u32 bdfn = (pdn->busno << 8) | pdn->devfn;

We can spare a call to pci_get_pdn() with
	bdfn = (pdev->bus->number << 8) | pdev->devfn;

+	u64 base_addr = 0;
+
+	int rc = opal_npu_mem_alloc(phb->opal_id, bdfn, size, &base_addr);
+
+	WARN_ON(rc);

Instead of a WARN, we should catch the error and return a null address to the caller.

+
+	base_addr = be64_to_cpu(base_addr);
+
+	rc = check_hotplug_memory_addressable(base_addr, base_addr + size);

That code is missing?

+	if (rc) {
+		dev_warn(&pdev->dev,
+			"LPC memory range 0x%llx-0x%llx is not fully addressable",
+			base_addr, base_addr + size - 1);
+		return 0;
+	}
+
+
+	return base_addr;
+}
+EXPORT_SYMBOL_GPL(pnv_ocxl_platform_lpc_setup);
+
+void pnv_ocxl_platform_lpc_release(struct pci_dev *pdev)
+{
+	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
+	struct pnv_phb *phb = hose->private_data;
+	struct pci_dn *pdn = pci_get_pdn(pdev);
+	u32 bdfn;
+	int rc;
+
+	bdfn = (pdn->busno << 8) | pdn->devfn;
+	rc = opal_npu_mem_release(phb->opal_id, bdfn);
+	WARN_ON(rc);

Same comments as above.

  Fred

+}
+EXPORT_SYMBOL_GPL(pnv_ocxl_platform_lpc_release);
+
+
 int pnv_ocxl_spa_remove_pe_from_cache(void *platform_data, int pe_handle)
 {
 	struct spa_data *data = (struct spa_data *) platform_data;
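
The two review points above (build bdfn straight from the device, and return a null address on failure instead of only warning) can be sketched in plain userspace C. Everything here is an assumption made for illustration: fake_pci_bus, fake_pci_dev, mem_alloc_fn and the two stub allocators are hypothetical stand-ins, not the kernel's struct pci_dev or the OPAL call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-ins for struct pci_bus / struct pci_dev. */
struct fake_pci_bus {
	uint8_t number;
};

struct fake_pci_dev {
	struct fake_pci_bus *bus;
	unsigned int devfn;
};

/* Bus/device/function number built straight from the device, no extra lookup. */
uint32_t bdfn_of(const struct fake_pci_dev *pdev)
{
	assert(pdev != NULL && pdev->bus != NULL);
	return ((uint32_t)pdev->bus->number << 8) | pdev->devfn;
}

/* Hypothetical firmware call: returns 0 on success and fills *base. */
typedef int64_t (*mem_alloc_fn)(uint32_t bdfn, uint64_t size, uint64_t *base);

/* Returns the base address, or 0 when the firmware call fails. */
uint64_t lpc_setup(const struct fake_pci_dev *pdev, uint64_t size,
		   mem_alloc_fn alloc)
{
	uint64_t base = 0;

	if (alloc(bdfn_of(pdev), size, &base) != 0)
		return 0;	/* report the failure instead of warning and carrying on */

	return base;
}

/* Two toy firmware stubs, for illustration only. */
int64_t fake_alloc_ok(uint32_t bdfn, uint64_t size, uint64_t *base)
{
	(void)bdfn;
	(void)size;
	*base = 0x2000000000ULL;
	return 0;
}

int64_t fake_alloc_fail(uint32_t bdfn, uint64_t size, uint64_t *base)
{
	(void)bdfn;
	(void)size;
	(void)base;
	return -1;
}
```

Returning 0 as the error value matches how the caller in the link code already treats a zero base address as "not mapped".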
Re: [PATCH 2/4] powerpc/powernv: remove the unused tunneling exports
On 21/06/2019 at 03:47, Oliver O'Halloran wrote:
On Thu, May 23, 2019 at 5:51 PM Christoph Hellwig wrote:

These have been unused ever since they've been added to the kernel.

Signed-off-by: Christoph Hellwig
---
 arch/powerpc/include/asm/pnv-pci.h        |  4 --
 arch/powerpc/platforms/powernv/pci-ioda.c |  4 +-
 arch/powerpc/platforms/powernv/pci.c      | 71 ---
 arch/powerpc/platforms/powernv/pci.h      |  1 -
 4 files changed, 3 insertions(+), 77 deletions(-)

diff --git a/arch/powerpc/include/asm/pnv-pci.h b/arch/powerpc/include/asm/pnv-pci.h
index 9fcb0bc462c6..1ab4b0111abc 100644
--- a/arch/powerpc/include/asm/pnv-pci.h
+++ b/arch/powerpc/include/asm/pnv-pci.h
@@ -27,12 +27,8 @@ extern int pnv_pci_get_power_state(uint64_t id, uint8_t *state);
 extern int pnv_pci_set_power_state(uint64_t id, uint8_t state,
 				   struct opal_msg *msg);
-extern int pnv_pci_enable_tunnel(struct pci_dev *dev, uint64_t *asnind);
-extern int pnv_pci_disable_tunnel(struct pci_dev *dev);
 extern int pnv_pci_set_tunnel_bar(struct pci_dev *dev, uint64_t addr,
 				  int enable);
-extern int pnv_pci_get_as_notify_info(struct task_struct *task, u32 *lpid,
-				      u32 *pid, u32 *tid);

IIRC as-notify was for CAPI which has an in-tree driver (cxl). Fred or Andrew (+cc), what's going on with this? Will it ever see the light of day?

The as-notify can be used in both CAPI mode and PCI mode. In CAPI mode, it's integrated in the CAPI protocol, so the cxl driver doesn't need to do extra setup, compared to what's already done to activate CAPI.

As mentioned in a previous iteration of that patchset, those APIs are to be used by the Mellanox CX5 driver. The in-tree driver is always a step behind their latest, but word is they are working on upstreaming those interactions.

  Fred

 int pnv_phb_to_cxl_mode(struct pci_dev *dev, uint64_t mode);
 int pnv_cxl_ioda_msi_setup(struct pci_dev *dev, unsigned int hwirq,
 			   unsigned int virq);
diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 126602b4e399..6b0caa2d0425 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -54,6 +54,8 @@ static const char * const pnv_phb_names[] = { "IODA1", "IODA2", "NPU_NVLINK",
 					      "NPU_OCAPI" };
+static void pnv_pci_ioda2_set_bypass(struct pnv_ioda_pe *pe, bool enable);
+
 void pe_level_printk(const struct pnv_ioda_pe *pe, const char *level,
 		     const char *fmt, ...)
 {
@@ -2360,7 +2362,7 @@ static long pnv_pci_ioda2_set_window(struct iommu_table_group *table_group,
 	return 0;
 }
-void pnv_pci_ioda2_set_bypass(struct pnv_ioda_pe *pe, bool enable)
+static void pnv_pci_ioda2_set_bypass(struct pnv_ioda_pe *pe, bool enable)
 {
 	uint16_t window_id = (pe->pe_number << 1 ) + 1;
 	int64_t rc;
diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
index 8d28f2932c3b..fc69f5611020 100644
--- a/arch/powerpc/platforms/powernv/pci.c
+++ b/arch/powerpc/platforms/powernv/pci.c
@@ -868,54 +868,6 @@ struct device_node *pnv_pci_get_phb_node(struct pci_dev *dev)
 }
 EXPORT_SYMBOL(pnv_pci_get_phb_node);
-int pnv_pci_enable_tunnel(struct pci_dev *dev, u64 *asnind)
-{
-	struct device_node *np;
-	const __be32 *prop;
-	struct pnv_ioda_pe *pe;
-	uint16_t window_id;
-	int rc;
-
-	if (!radix_enabled())
-		return -ENXIO;
-
-	if (!(np = pnv_pci_get_phb_node(dev)))
-		return -ENXIO;
-
-	prop = of_get_property(np, "ibm,phb-indications", NULL);
-	of_node_put(np);
-
-	if (!prop || !prop[1])
-		return -ENXIO;
-
-	*asnind = (u64)be32_to_cpu(prop[1]);
-	pe = pnv_ioda_get_pe(dev);
-	if (!pe)
-		return -ENODEV;
-
-	/* Increase real window size to accept as_notify messages. */
-	window_id = (pe->pe_number << 1 ) + 1;
-	rc = opal_pci_map_pe_dma_window_real(pe->phb->opal_id, pe->pe_number,
-					     window_id, pe->tce_bypass_base,
-					     (uint64_t)1 << 48);
-	return opal_error_code(rc);
-}
-EXPORT_SYMBOL_GPL(pnv_pci_enable_tunnel);
-
-int pnv_pci_disable_tunnel(struct pci_dev *dev)
-{
-	struct pnv_ioda_pe *pe;
-
-	pe = pnv_ioda_get_pe(dev);
-	if (!pe)
-		return -ENODEV;
-
-	/* Restore default real window size. */
-	pnv_pci_ioda2_set_bypass(pe, true);
-	return 0;
-}
-EXPORT_SYMBOL_GPL(pnv_pci_disable_tunnel);
-
 int pnv_pci_set_tunnel_bar(struct pci_dev *dev, u64 addr, int enable)
 {
 	__be64 val;
@@ -970,29 +922,6 @@ int pnv_pci_set_tunnel_bar(struct pci_dev *dev, u64 addr, int enable)
 }
Re: [PATCH v2] ocxl: Allow contexts to be attached with a NULL mm
On 20/06/2019 at 06:12, Alastair D'Silva wrote:
From: Alastair D'Silva

If an OpenCAPI context is to be used directly by a kernel driver, there may not be a suitable mm to use. The patch makes the mm parameter to ocxl_context_attach optional.

Signed-off-by: Alastair D'Silva
---

Thanks for the update.
Acked-by: Frederic Barrat

 arch/powerpc/mm/book3s64/radix_tlb.c |  5 +
 drivers/misc/ocxl/context.c          |  9 ++---
 drivers/misc/ocxl/link.c             | 28
 3 files changed, 35 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/mm/book3s64/radix_tlb.c b/arch/powerpc/mm/book3s64/radix_tlb.c
index bb9835681315..ce8a77fae6a7 100644
--- a/arch/powerpc/mm/book3s64/radix_tlb.c
+++ b/arch/powerpc/mm/book3s64/radix_tlb.c
@@ -666,6 +666,11 @@ EXPORT_SYMBOL(radix__flush_tlb_page);
 #define radix__flush_all_mm radix__local_flush_all_mm
 #endif /* CONFIG_SMP */
+/*
+ * If kernel TLBIs ever become local rather than global, then
+ * drivers/misc/ocxl/link.c:ocxl_link_add_pe will need some work, as it
+ * assumes kernel TLBIs are global.
+ */
 void radix__flush_tlb_kernel_range(unsigned long start, unsigned long end)
 {
 	_tlbie_pid(0, RIC_FLUSH_ALL);
diff --git a/drivers/misc/ocxl/context.c b/drivers/misc/ocxl/context.c
index bab9c9364184..994563a078eb 100644
--- a/drivers/misc/ocxl/context.c
+++ b/drivers/misc/ocxl/context.c
@@ -69,6 +69,7 @@ static void xsl_fault_error(void *data, u64 addr, u64 dsisr)
 int ocxl_context_attach(struct ocxl_context *ctx, u64 amr, struct mm_struct *mm)
 {
 	int rc;
+	unsigned long pidr = 0;
 	// Locks both status & tidr
 	mutex_lock(&ctx->status_mutex);
@@ -77,9 +78,11 @@ int ocxl_context_attach(struct ocxl_context *ctx, u64 amr, struct mm_struct *mm)
 		goto out;
 	}
-	rc = ocxl_link_add_pe(ctx->afu->fn->link, ctx->pasid,
-			mm->context.id, ctx->tidr, amr, mm,
-			xsl_fault_error, ctx);
+	if (mm)
+		pidr = mm->context.id;
+
+	rc = ocxl_link_add_pe(ctx->afu->fn->link, ctx->pasid, pidr, ctx->tidr,
+			amr, mm, xsl_fault_error, ctx);
 	if (rc)
 		goto out;
diff --git a/drivers/misc/ocxl/link.c b/drivers/misc/ocxl/link.c
index cce5b0d64505..58d111afd9f6 100644
--- a/drivers/misc/ocxl/link.c
+++ b/drivers/misc/ocxl/link.c
@@ -224,6 +224,17 @@ static irqreturn_t xsl_fault_handler(int irq, void *data)
 		ack_irq(spa, ADDRESS_ERROR);
 		return IRQ_HANDLED;
 	}
+
+	if (!pe_data->mm) {
+		/*
+		 * translation fault from a kernel context - an OpenCAPI
+		 * device tried to access a bad kernel address
+		 */
+		rcu_read_unlock();
+		pr_warn("Unresolved OpenCAPI xsl fault in kernel context\n");
+		ack_irq(spa, ADDRESS_ERROR);
+		return IRQ_HANDLED;
+	}
 	WARN_ON(pe_data->mm->context.id != pid);
 	if (mmget_not_zero(pe_data->mm)) {
@@ -523,7 +534,13 @@ int ocxl_link_add_pe(void *link_handle, int pasid, u32 pidr, u32 tidr,
 	pe->amr = cpu_to_be64(amr);
 	pe->software_state = cpu_to_be32(SPA_PE_VALID);
-	mm_context_add_copro(mm);
+	/*
+	 * For user contexts, register a copro so that TLBIs are seen
+	 * by the nest MMU. If we have a kernel context, TLBIs are
+	 * already global.
+	 */
+	if (mm)
+		mm_context_add_copro(mm);
 	/*
 	 * Barrier is to make sure PE is visible in the SPA before it
 	 * is used by the device. It also helps with the global TLBI
@@ -546,7 +563,8 @@ int ocxl_link_add_pe(void *link_handle, int pasid, u32 pidr, u32 tidr,
 	 * have a reference on mm_users. Incrementing mm_count solves
 	 * the problem.
 	 */
-	mmgrab(mm);
+	if (mm)
+		mmgrab(mm);
 	trace_ocxl_context_add(current->pid, spa->spa_mem, pasid, pidr, tidr);
 unlock:
 	mutex_unlock(&spa->spa_lock);
@@ -652,8 +670,10 @@ int ocxl_link_remove_pe(void *link_handle, int pasid)
 	if (!pe_data) {
 		WARN(1, "Couldn't find pe data when removing PE\n");
 	} else {
-		mm_context_remove_copro(pe_data->mm);
-		mmdrop(pe_data->mm);
+		if (pe_data->mm) {
+			mm_context_remove_copro(pe_data->mm);
+			mmdrop(pe_data->mm);
+		}
 		kfree_rcu(pe_data, rcu);
 	}
 unlock:
Re: [PATCH] ocxl: Allow contexts to be attached with a NULL mm
On 18/06/2019 at 03:50, Andrew Donnellan wrote:
On 17/6/19 2:41 pm, Alastair D'Silva wrote:
From: Alastair D'Silva

If an OpenCAPI context is to be used directly by a kernel driver, there may not be a suitable mm to use. The patch makes the mm parameter to ocxl_context_attach optional.

Signed-off-by: Alastair D'Silva

The one issue I can see here is that using mm == NULL bypasses our method of enabling/disabling global TLBIs in mm_context_add_copro(). Discussing this privately with Alastair and Fred - this should be fine, but perhaps we should document that.

So indeed we should be fine. I confirmed with Nick that kernel space invalidations are already global today. Nick mentioned that we should still be fine tomorrow, but in the distant future, we could imagine local usage of some part of the kernel space. It will require some work, but it would be best to add a comment in one of the kernel invalidation functions (for example radix__flush_tlb_kernel_range()) that if a kernel invalidation ever becomes local, then clients of the nest MMU may need some work.

A few more comments below.

---
 drivers/misc/ocxl/context.c |  9 ++---
 drivers/misc/ocxl/link.c    | 12
 2 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/drivers/misc/ocxl/context.c b/drivers/misc/ocxl/context.c
index bab9c9364184..994563a078eb 100644
--- a/drivers/misc/ocxl/context.c
+++ b/drivers/misc/ocxl/context.c
@@ -69,6 +69,7 @@ static void xsl_fault_error(void *data, u64 addr, u64 dsisr)
 int ocxl_context_attach(struct ocxl_context *ctx, u64 amr, struct mm_struct *mm)
 {
 	int rc;
+	unsigned long pidr = 0;
 	// Locks both status & tidr
 	mutex_lock(&ctx->status_mutex);
@@ -77,9 +78,11 @@ int ocxl_context_attach(struct ocxl_context *ctx, u64 amr, struct mm_struct *mm)
 		goto out;
 	}
-	rc = ocxl_link_add_pe(ctx->afu->fn->link, ctx->pasid,
-			mm->context.id, ctx->tidr, amr, mm,
-			xsl_fault_error, ctx);
+	if (mm)
+		pidr = mm->context.id;
+
+	rc = ocxl_link_add_pe(ctx->afu->fn->link, ctx->pasid, pidr, ctx->tidr,
+			amr, mm, xsl_fault_error, ctx);
 	if (rc)
 		goto out;
diff --git a/drivers/misc/ocxl/link.c b/drivers/misc/ocxl/link.c
index cce5b0d64505..43542f124807 100644
--- a/drivers/misc/ocxl/link.c
+++ b/drivers/misc/ocxl/link.c
@@ -523,7 +523,8 @@ int ocxl_link_add_pe(void *link_handle, int pasid, u32 pidr, u32 tidr,
 	pe->amr = cpu_to_be64(amr);
 	pe->software_state = cpu_to_be32(SPA_PE_VALID);
-	mm_context_add_copro(mm);
+	if (mm)
+		mm_context_add_copro(mm);

Same as above, we should add a comment here in the driver code that a kernel context is ok because invalidations are global.

We also need a new check in xsl_fault_handler(). A valid kernel address shouldn't fault, but it's still possible for the FPGA to try accessing a bogus kernel address. In which case, xsl_fault_handler() would be entered, with a valid fault context. We'll find pe_data in the tree based on the valid pe_handle, but pe_data->mm will be NULL. In that case, we can return early, acknowledging the interrupt with the ADDRESS_ERROR value (like we do if pe_data is not found in the tree).

  Fred

 	/*
 	 * Barrier is to make sure PE is visible in the SPA before it
 	 * is used by the device. It also helps with the global TLBI
@@ -546,7 +547,8 @@ int ocxl_link_add_pe(void *link_handle, int pasid, u32 pidr, u32 tidr,
 	 * have a reference on mm_users. Incrementing mm_count solves
 	 * the problem.
 	 */
-	mmgrab(mm);
+	if (mm)
+		mmgrab(mm);
 	trace_ocxl_context_add(current->pid, spa->spa_mem, pasid, pidr, tidr);
 unlock:
 	mutex_unlock(&spa->spa_lock);
@@ -652,8 +654,10 @@ int ocxl_link_remove_pe(void *link_handle, int pasid)
 	if (!pe_data) {
 		WARN(1, "Couldn't find pe data when removing PE\n");
 	} else {
-		mm_context_remove_copro(pe_data->mm);
-		mmdrop(pe_data->mm);
+		if (pe_data->mm) {
+			mm_context_remove_copro(pe_data->mm);
+			mmdrop(pe_data->mm);
+		}
 		kfree_rcu(pe_data, rcu);
 	}
 unlock:
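
The early-return check asked for in xsl_fault_handler() above can be sketched outside the kernel as a small decision function. This is only an illustration under assumptions: mm_stub, pe_data_stub, enum fault_action and classify_fault() are hypothetical names, and the real handler also deals with RCU, the interrupt acknowledgement and the deferred work, none of which is modelled here.

```c
#include <assert.h>	/* only needed by callers that self-check the sketch */
#include <stddef.h>

/* Hypothetical outcomes of the top-half fault handler. */
enum fault_action {
	FAULT_ACK_ADDRESS_ERROR,	/* acknowledge with an address error, return early */
	FAULT_RESOLVE_USER,		/* user context: go on and resolve against the mm */
};

/* Stand-in for struct mm_struct. */
struct mm_stub {
	int id;
};

/* Stand-in for the per-PE data; mm is NULL for a kernel context. */
struct pe_data_stub {
	struct mm_stub *mm;
};

enum fault_action classify_fault(const struct pe_data_stub *pe_data)
{
	/* PE not found in the tree: nothing to resolve against. */
	if (pe_data == NULL)
		return FAULT_ACK_ADDRESS_ERROR;

	/* Kernel context: a fault means the device used a bad kernel address. */
	if (pe_data->mm == NULL)
		return FAULT_ACK_ADDRESS_ERROR;

	return FAULT_RESOLVE_USER;
}
```

Both failure cases end the same way, which is the point made in the review: a NULL mm is treated like a missing pe_data rather than being dereferenced.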
Re: [PATCH] ocxl: do not use C++ style comments in uapi header
On 04/06/2019 at 13:16, Masahiro Yamada wrote:

Linux kernel tolerates C++ style comments these days. Actually, the SPDX License tags for .c files start with //.

On the other hand, uapi headers are written in more strict C, where the C++ comment style is forbidden.

Signed-off-by: Masahiro Yamada
---

Thanks!
Acked-by: Frederic Barrat

 include/uapi/misc/ocxl.h | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/include/uapi/misc/ocxl.h b/include/uapi/misc/ocxl.h
index 97937cfa3baa..6d29a60a896a 100644
--- a/include/uapi/misc/ocxl.h
+++ b/include/uapi/misc/ocxl.h
@@ -33,23 +33,23 @@ struct ocxl_ioctl_attach {
 };
 struct ocxl_ioctl_metadata {
-	__u16 version; // struct version, always backwards compatible
+	__u16 version; /* struct version, always backwards compatible */
-	// Version 0 fields
+	/* Version 0 fields */
 	__u8 afu_version_major;
 	__u8 afu_version_minor;
-	__u32 pasid;		// PASID assigned to the current context
+	__u32 pasid;		/* PASID assigned to the current context */
-	__u64 pp_mmio_size;	// Per PASID MMIO size
+	__u64 pp_mmio_size;	/* Per PASID MMIO size */
 	__u64 global_mmio_size;
-	// End version 0 fields
+	/* End version 0 fields */
-	__u64 reserved[13]; // Total of 16*u64
+	__u64 reserved[13]; /* Total of 16*u64 */
 };
 struct ocxl_ioctl_p9_wait {
-	__u16 thread_id; // The thread ID required to wake this thread
+	__u16 thread_id; /* The thread ID required to wake this thread */
 	__u16 reserved1;
 	__u32 reserved2;
 	__u64 reserved3[3];
Re: [PATCH] misc: remove redundant 'default n' from Kconfig-s
Le 20/05/2019 à 16:10, Bartlomiej Zolnierkiewicz a écrit : 'default n' is the default value for any bool or tristate Kconfig setting so there is no need to write it explicitly. Also since commit f467c5640c29 ("kconfig: only write '# CONFIG_FOO is not set' for visible symbols") the Kconfig behavior is the same regardless of 'default n' being present or not: ... One side effect of (and the main motivation for) this change is making the following two definitions behave exactly the same: config FOO bool config FOO bool default n With this change, neither of these will generate a '# CONFIG_FOO is not set' line (assuming FOO isn't selected/implied). That might make it clearer to people that a bare 'default n' is redundant. ... Signed-off-by: Bartlomiej Zolnierkiewicz --- for cxl and ocxl: Acked-by: Frederic Barrat drivers/misc/Kconfig | 10 -- drivers/misc/altera-stapl/Kconfig |1 - drivers/misc/c2port/Kconfig |2 -- drivers/misc/cb710/Kconfig|1 - drivers/misc/cxl/Kconfig |3 --- drivers/misc/echo/Kconfig |1 - drivers/misc/genwqe/Kconfig |1 - drivers/misc/lis3lv02d/Kconfig|2 -- drivers/misc/ocxl/Kconfig |1 - 9 files changed, 22 deletions(-) Index: b/drivers/misc/Kconfig === --- a/drivers/misc/Kconfig +++ b/drivers/misc/Kconfig @@ -8,7 +8,6 @@ config SENSORS_LIS3LV02D tristate depends on INPUT select INPUT_POLLDEV - default n config AD525X_DPOT tristate "Analog Devices Digital Potentiometers" @@ -61,7 +60,6 @@ config ATMEL_TCLIB config DUMMY_IRQ tristate "Dummy IRQ handler" - default n ---help--- This module accepts a single 'irq' parameter, which it should register for. 
The sole purpose of this module is to help with debugging of systems on @@ -117,7 +115,6 @@ config PHANTOM config INTEL_MID_PTI tristate "Parallel Trace Interface for MIPI P1149.7 cJTAG standard" depends on PCI && TTY && (X86_INTEL_MID || COMPILE_TEST) - default n help The PTI (Parallel Trace Interface) driver directs trace data routed from various parts in the system out @@ -193,7 +190,6 @@ config ATMEL_SSC config ENCLOSURE_SERVICES tristate "Enclosure Services" - default n help Provides support for intelligent enclosures (bays which contain storage devices). You also need either a host @@ -217,7 +213,6 @@ config SGI_XP config CS5535_MFGPT tristate "CS5535/CS5536 Geode Multi-Function General Purpose Timer (MFGPT) support" depends on MFD_CS5535 - default n help This driver provides access to MFGPT functionality for other drivers that need timers. MFGPTs are available in the CS5535 and @@ -250,7 +245,6 @@ config CS5535_CLOCK_EVENT_SRC config HP_ILO tristate "Channel interface driver for the HP iLO processor" depends on PCI - default n help The channel interface driver allows applications to communicate with iLO management processors present on HP ProLiant servers. @@ -285,7 +279,6 @@ config QCOM_FASTRPC config SGI_GRU tristate "SGI GRU driver" depends on X86_UV && SMP - default n select MMU_NOTIFIER ---help--- The GRU is a hardware resource located in the system chipset. The GRU @@ -300,7 +293,6 @@ config SGI_GRU config SGI_GRU_DEBUG bool "SGI GRU driver debug" depends on SGI_GRU - default n ---help--- This option enables additional debugging code for the SGI GRU driver. If you are unsure, say N. @@ -358,7 +350,6 @@ config SENSORS_BH1770 config SENSORS_APDS990X tristate "APDS990X combined als and proximity sensors" depends on I2C -default n ---help--- Say Y here if you want to build a driver for Avago APDS990x combined ambient light and proximity sensor chip. 
@@ -386,7 +377,6 @@ config DS1682 config SPEAR13XX_PCIE_GADGET bool "PCIe gadget support for SPEAr13XX platform" depends on ARCH_SPEAR13XX && BROKEN - default n help This option enables gadget support for PCIe controller. If board file defines any controller as PCIe endpoint then a sysfs Index: b/drivers/misc/altera-stapl/Kconfig === --- a/drivers/misc/altera-stapl/Kconfig +++ b/drivers/misc/altera-stapl/Kconfig @@ -4,6 +4,5 @@ comment "Altera FPGA firm
Re: [PATCH v4 0/4] ocxl: OpenCAPI Cleanup
On 25/03/2019 at 17:49, Greg Kurz wrote:

Hi Alastair,

I forgot to mention it during v3, but please don't link a new version of a patchset to the previous one with --in-reply-to. This is to ensure I can see them in my email client without having to scroll back many days in the past (which likely means a fair number of e-mails on linuxppc-dev).

I'm also seeing the other series (ocxl refactoring) somehow under the same thread. I haven't checked why, and there may be some mail client bug in the way; I just mention it in case you see a reason for it in the way you submit the patch set.

  Fred

Cheers,
--
Greg

On Mon, 25 Mar 2019 16:34:51 +1100 "Alastair D'Silva" wrote:
From: Alastair D'Silva

Some minor cleanups for the OpenCAPI driver as a prerequisite for an ocxl driver refactoring to allow the driver core to be utilised by external drivers.

Changelog:
V4:
  - Drop printf format changes as they omit the format indicator for '0'
V3:
  - Add missed header in 'ocxl: Remove some unused exported symbols'. This addresses the introduced sparse warnings
V2:
  - remove intermediate assignment of 'link' var in 'Rename struct link to ocxl_link'
  - Don't shift definition of ocxl_context_attach in 'Remove some unused exported symbols'

Alastair D'Silva (4):
  ocxl: Rename struct link to ocxl_link
  ocxl: read_pasid never returns an error, so make it void
  ocxl: Remove superfluous 'extern' from headers
  ocxl: Remove some unused exported symbols

 drivers/misc/ocxl/config.c        | 13 ++---
 drivers/misc/ocxl/file.c          |  5 +-
 drivers/misc/ocxl/link.c          | 36 ++---
 drivers/misc/ocxl/ocxl_internal.h | 85 +++
 include/misc/ocxl.h               | 53 ++-
 5 files changed, 91 insertions(+), 101 deletions(-)
Re: [PATCH v3 7/7] ocxl: Provide global MMIO accessors for external drivers
Le 25/03/2019 à 06:44, Alastair D'Silva a écrit : From: Alastair D'Silva External drivers that communicate via OpenCAPI will need to make MMIO calls to interact with the devices. Signed-off-by: Alastair D'Silva Reviewed-by: Greg Kurz --- Acked-by: Frederic Barrat drivers/misc/ocxl/Makefile | 2 +- drivers/misc/ocxl/mmio.c | 234 + include/misc/ocxl.h| 110 + 3 files changed, 345 insertions(+), 1 deletion(-) create mode 100644 drivers/misc/ocxl/mmio.c diff --git a/drivers/misc/ocxl/Makefile b/drivers/misc/ocxl/Makefile index bc4e39bfda7b..d07d1bb8e8d4 100644 --- a/drivers/misc/ocxl/Makefile +++ b/drivers/misc/ocxl/Makefile @@ -1,7 +1,7 @@ # SPDX-License-Identifier: GPL-2.0+ ccflags-$(CONFIG_PPC_WERROR) += -Werror -ocxl-y+= main.o pci.o config.o file.o pasid.o +ocxl-y += main.o pci.o config.o file.o pasid.o mmio.o ocxl-y+= link.o context.o afu_irq.o sysfs.o trace.o ocxl-y+= core.o obj-$(CONFIG_OCXL)+= ocxl.o diff --git a/drivers/misc/ocxl/mmio.c b/drivers/misc/ocxl/mmio.c new file mode 100644 index ..aae713db4ebe --- /dev/null +++ b/drivers/misc/ocxl/mmio.c @@ -0,0 +1,234 @@ +// SPDX-License-Identifier: GPL-2.0+ +// Copyright 2019 IBM Corp. 
+#include +#include "trace.h" +#include "ocxl_internal.h" + +int ocxl_global_mmio_read32(struct ocxl_afu *afu, size_t offset, + enum ocxl_endian endian, u32 *val) +{ + if (offset > afu->config.global_mmio_size - 4) + return -EINVAL; + +#ifdef __BIG_ENDIAN__ + if (endian == OCXL_HOST_ENDIAN) + endian = OCXL_BIG_ENDIAN; +#endif + + switch (endian) { + case OCXL_BIG_ENDIAN: + *val = readl_be((char *)afu->global_mmio_ptr + offset); + break; + + default: + *val = readl((char *)afu->global_mmio_ptr + offset); + break; + } + + return 0; +} +EXPORT_SYMBOL_GPL(ocxl_global_mmio_read32); + +int ocxl_global_mmio_read64(struct ocxl_afu *afu, size_t offset, + enum ocxl_endian endian, u64 *val) +{ + if (offset > afu->config.global_mmio_size - 8) + return -EINVAL; + +#ifdef __BIG_ENDIAN__ + if (endian == OCXL_HOST_ENDIAN) + endian = OCXL_BIG_ENDIAN; +#endif + + switch (endian) { + case OCXL_BIG_ENDIAN: + *val = readq_be((char *)afu->global_mmio_ptr + offset); + break; + + default: + *val = readq((char *)afu->global_mmio_ptr + offset); + break; + } + + return 0; +} +EXPORT_SYMBOL_GPL(ocxl_global_mmio_read64); + +int ocxl_global_mmio_write32(struct ocxl_afu *afu, size_t offset, + enum ocxl_endian endian, u32 val) +{ + if (offset > afu->config.global_mmio_size - 4) + return -EINVAL; + +#ifdef __BIG_ENDIAN__ + if (endian == OCXL_HOST_ENDIAN) + endian = OCXL_BIG_ENDIAN; +#endif + + switch (endian) { + case OCXL_BIG_ENDIAN: + writel_be(val, (char *)afu->global_mmio_ptr + offset); + break; + + default: + writel(val, (char *)afu->global_mmio_ptr + offset); + break; + } + + + return 0; +} +EXPORT_SYMBOL_GPL(ocxl_global_mmio_write32); + +int ocxl_global_mmio_write64(struct ocxl_afu *afu, size_t offset, + enum ocxl_endian endian, u64 val) +{ + if (offset > afu->config.global_mmio_size - 8) + return -EINVAL; + +#ifdef __BIG_ENDIAN__ + if (endian == OCXL_HOST_ENDIAN) + endian = OCXL_BIG_ENDIAN; +#endif + + switch (endian) { + case OCXL_BIG_ENDIAN: + writeq_be(val, (char *)afu->global_mmio_ptr 
+ offset); + break; + + default: + writeq(val, (char *)afu->global_mmio_ptr + offset); + break; + } + + + return 0; +} +EXPORT_SYMBOL_GPL(ocxl_global_mmio_write64); + +int ocxl_global_mmio_set32(struct ocxl_afu *afu, size_t offset, + enum ocxl_endian endian, u32 mask) +{ + u32 tmp; + + if (offset > afu->config.global_mmio_size - 4) + return -EINVAL; + +#ifdef __BIG_ENDIAN__ + if (endian == OCXL_HOST_ENDIAN) + endian = OCXL_BIG_ENDIAN; +#endif + + switch (endian) { + case OCXL_BIG_ENDIAN: + tmp = readl_be((char *)afu->global_mmio_ptr + offset); + tmp |= mask; + writel_be(tmp, (char *)afu->global_mmio_ptr + offset); + break; + + default: + tmp = readl((char *)afu->global_mmio_ptr + offset); + tmp |= mask; +
Re: [PATCH v3 4/7] ocxl: Allow external drivers to use OpenCAPI contexts
Le 25/03/2019 à 06:44, Alastair D'Silva a écrit : From: Alastair D'Silva Most OpenCAPI operations require a valid context, so exposing these functions to external drivers is necessary. Signed-off-by: Alastair D'Silva Reviewed-by: Greg Kurz --- See comment on previous patch regarding merging ocxl_context_alloc() and ocxl_context_init(), it would impact the exported symbols. Fred drivers/misc/ocxl/context.c | 9 +-- drivers/misc/ocxl/file.c | 2 +- drivers/misc/ocxl/ocxl_internal.h | 6 - include/misc/ocxl.h | 45 +++ 4 files changed, 53 insertions(+), 9 deletions(-) diff --git a/drivers/misc/ocxl/context.c b/drivers/misc/ocxl/context.c index c73a859d2224..8b97b0f19db8 100644 --- a/drivers/misc/ocxl/context.c +++ b/drivers/misc/ocxl/context.c @@ -8,6 +8,7 @@ struct ocxl_context *ocxl_context_alloc(void) { return kzalloc(sizeof(struct ocxl_context), GFP_KERNEL); } +EXPORT_SYMBOL_GPL(ocxl_context_alloc); int ocxl_context_init(struct ocxl_context *ctx, struct ocxl_afu *afu, struct address_space *mapping) @@ -43,6 +44,7 @@ int ocxl_context_init(struct ocxl_context *ctx, struct ocxl_afu *afu, ocxl_afu_get(afu); return 0; } +EXPORT_SYMBOL_GPL(ocxl_context_init); /* * Callback for when a translation fault triggers an error @@ -63,7 +65,7 @@ static void xsl_fault_error(void *data, u64 addr, u64 dsisr) wake_up_all(>events_wq); } -int ocxl_context_attach(struct ocxl_context *ctx, u64 amr) +int ocxl_context_attach(struct ocxl_context *ctx, u64 amr, struct mm_struct *mm) { int rc; @@ -75,7 +77,7 @@ int ocxl_context_attach(struct ocxl_context *ctx, u64 amr) } rc = ocxl_link_add_pe(ctx->afu->fn->link, ctx->pasid, - current->mm->context.id, ctx->tidr, amr, current->mm, + mm->context.id, ctx->tidr, amr, mm, xsl_fault_error, ctx); if (rc) goto out; @@ -85,6 +87,7 @@ int ocxl_context_attach(struct ocxl_context *ctx, u64 amr) mutex_unlock(>status_mutex); return rc; } +EXPORT_SYMBOL_GPL(ocxl_context_attach); static vm_fault_t map_afu_irq(struct vm_area_struct *vma, unsigned long address, 
u64 offset, struct ocxl_context *ctx) @@ -243,6 +246,7 @@ int ocxl_context_detach(struct ocxl_context *ctx) } return 0; } +EXPORT_SYMBOL_GPL(ocxl_context_detach); void ocxl_context_detach_all(struct ocxl_afu *afu) { @@ -280,3 +284,4 @@ void ocxl_context_free(struct ocxl_context *ctx) ocxl_afu_put(ctx->afu); kfree(ctx); } +EXPORT_SYMBOL_GPL(ocxl_context_free); diff --git a/drivers/misc/ocxl/file.c b/drivers/misc/ocxl/file.c index e6e6121cd9a3..e51578186fd4 100644 --- a/drivers/misc/ocxl/file.c +++ b/drivers/misc/ocxl/file.c @@ -94,7 +94,7 @@ static long afu_ioctl_attach(struct ocxl_context *ctx, return -EINVAL; amr = arg.amr & mfspr(SPRN_UAMOR); - rc = ocxl_context_attach(ctx, amr); + rc = ocxl_context_attach(ctx, amr, current->mm); return rc; } diff --git a/drivers/misc/ocxl/ocxl_internal.h b/drivers/misc/ocxl/ocxl_internal.h index e04e547df29e..cda1e7667fc8 100644 --- a/drivers/misc/ocxl/ocxl_internal.h +++ b/drivers/misc/ocxl/ocxl_internal.h @@ -130,15 +130,9 @@ int ocxl_config_check_afu_index(struct pci_dev *dev, */ int ocxl_link_update_pe(void *link_handle, int pasid, __u16 tid); -struct ocxl_context *ocxl_context_alloc(void); -int ocxl_context_init(struct ocxl_context *ctx, struct ocxl_afu *afu, - struct address_space *mapping); -int ocxl_context_attach(struct ocxl_context *ctx, u64 amr); int ocxl_context_mmap(struct ocxl_context *ctx, struct vm_area_struct *vma); -int ocxl_context_detach(struct ocxl_context *ctx); void ocxl_context_detach_all(struct ocxl_afu *afu); -void ocxl_context_free(struct ocxl_context *ctx); int ocxl_sysfs_register_afu(struct ocxl_afu *afu); void ocxl_sysfs_unregister_afu(struct ocxl_afu *afu); diff --git a/include/misc/ocxl.h b/include/misc/ocxl.h index 8bafd748e380..a8fe0ce4ea67 100644 --- a/include/misc/ocxl.h +++ b/include/misc/ocxl.h @@ -116,6 +116,51 @@ const struct ocxl_fn_config *ocxl_function_config(struct ocxl_fn *fn); */ void ocxl_function_close(struct ocxl_fn *fn); +// Context allocation + +/** + * Allocate space for a new 
OpenCAPI context + * + * Returns NULL on failure + */ +struct ocxl_context *ocxl_context_alloc(void); + +/** + * Initialize an OpenCAPI context + * + * @ctx: The OpenCAPI context to initialize + * @afu: The AFU the context belongs to + * @mapping: The mapping to unmap when the context is closed (may be NULL) + */ +int ocxl_context_init(struct ocxl_context *ctx, struct ocxl_afu *afu, + struct address_space *mapping); + +/** + * Free an OpenCAPI context + * + * @ctx: The
Re: [PATCH 1/5] ocxl: Rename struct link to ocxl_link
On 27/02/2019 at 09:18, Andrew Donnellan wrote: On 27/2/19 7:04 pm, Alastair D'Silva wrote: -Original Message- From: Andrew Donnellan Sent: Wednesday, 27 February 2019 6:55 PM To: Alastair D'Silva ; 'Alastair D'Silva' Cc: 'Greg Kurz' ; 'Frederic Barrat' ; 'Arnd Bergmann' ; 'Greg Kroah-Hartman' ; linuxppc-...@lists.ozlabs.org; linux-kernel@vger.kernel.org Subject: Re: [PATCH 1/5] ocxl: Rename struct link to ocxl_link On 27/2/19 6:34 pm, Alastair D'Silva wrote: diff --git a/drivers/misc/ocxl/file.c b/drivers/misc/ocxl/file.c index e6a607488f8a..16eb8a60d5c7 100644 --- a/drivers/misc/ocxl/file.c +++ b/drivers/misc/ocxl/file.c @@ -152,7 +152,7 @@ static long afu_ioctl_enable_p9_wait(struct ocxl_context *ctx, if (status == ATTACHED) { int rc; - struct link *link = ctx->afu->fn->link; + void *link = ctx->afu->fn->link; This doesn't look like a rename... That corrects the type to what the member (and prototype for ocxl_link_update_pe) declare it as. The struct link there is bogus, it shouldn't even compile (since the intended struct link is defined in a different compilation unit), but instead picks up a different definition of 'struct link' from elsewhere. Given there's only a handful of struct links defined across the entire kernel, I'm going to guess that the definition it's picking up is in fact the ocxl one. Unlikely, since that's never in a header. It wasn't caught since it was assigned to/from a void*. Ah, yeah that'd explain it... and it's a pointer so it never needs to know its size. I'm clearly not very good at C. I think the better solution here is to move struct ocxl_link into ocxl_internal.h, change ocxl_fn::link to be struct ocxl_link * rather than void *, and update the function signature for ocxl_link_update_pe() as well. Not move it, but we could have an opaque declaration there.
Putting it there would fit with all the other ocxl_* structs, but either way, we definitely need a declaration in there and get rid of the void*, t Mmm, it might turn out to be more invasive than planned... The point was only to have it as an opaque type to the outside world, for APIs we'd like to deprecate at some point, so I wouldn't sweat too much over it. Fred
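To make the opaque-declaration idea from this thread concrete, here is a minimal userspace sketch. It is not the ocxl code: struct demo_link, its accessors, and bogus_type_still_compiles() are hypothetical names invented for illustration. The last function reproduces the pitfall discussed above: a pointer to a struct type that is only declared on the spot can be initialized from a void * without any diagnostic, because the compiler never needs the struct's size.

```c
#include <assert.h>   /* used by the checks in the usage note */
#include <stdlib.h>

/* "Header" part: callers only ever see an incomplete (opaque) type. */
struct demo_link;                                /* hypothetical name */
struct demo_link *demo_link_create(int id);
int demo_link_id(const struct demo_link *link);
void demo_link_destroy(struct demo_link *link);

/* "Implementation" part: the layout is known in one place only. */
struct demo_link {
	int id;
};

struct demo_link *demo_link_create(int id)
{
	struct demo_link *link = malloc(sizeof(*link));

	if (link)
		link->id = id;
	return link;
}

int demo_link_id(const struct demo_link *link)
{
	return link->id;
}

void demo_link_destroy(struct demo_link *link)
{
	free(link);
}

/*
 * The pitfall: 'struct unrelated_link' is declared right here as an
 * incomplete type, and void * converts to it silently, so a wrong
 * struct name goes unnoticed as long as it is never dereferenced.
 */
int bogus_type_still_compiles(void *handle)
{
	struct unrelated_link *l = handle;

	return l != NULL;
}
```

With a typed opaque pointer in the prototypes, passing the wrong struct pointer would produce an incompatible-pointer diagnostic, which a void * handle cannot give. A quick check: create a link with id 7, read it back with demo_link_id(), and note that bogus_type_still_compiles() accepts the same pointer without complaint.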
Re: [PATCH 2/5] ocxl: Clean up printf formats
On 27/02/2019 at 05:57, Alastair D'Silva wrote: From: Alastair D'Silva Use %# instead of using a literal '0x' Signed-off-by: Alastair D'Silva --- I don't really care either way, but it looks ok. Acked-by: Frederic Barrat drivers/misc/ocxl/config.c | 6 +++--- drivers/misc/ocxl/context.c | 2 +- drivers/misc/ocxl/trace.h | 10 +- 3 files changed, 9 insertions(+), 9 deletions(-) diff --git a/drivers/misc/ocxl/config.c b/drivers/misc/ocxl/config.c index 8f2c5d8bd2ee..0ee7856b033d 100644 --- a/drivers/misc/ocxl/config.c +++ b/drivers/misc/ocxl/config.c @@ -178,9 +178,9 @@ static int read_dvsec_vendor(struct pci_dev *dev) pci_read_config_dword(dev, pos + OCXL_DVSEC_VENDOR_DLX_VERS, &dlx); dev_dbg(&dev->dev, "Vendor specific DVSEC:\n"); - dev_dbg(&dev->dev, " CFG version = 0x%x\n", cfg); - dev_dbg(&dev->dev, " TLX version = 0x%x\n", tlx); - dev_dbg(&dev->dev, " DLX version = 0x%x\n", dlx); + dev_dbg(&dev->dev, " CFG version = %#x\n", cfg); + dev_dbg(&dev->dev, " TLX version = %#x\n", tlx); + dev_dbg(&dev->dev, " DLX version = %#x\n", dlx); return 0; } diff --git a/drivers/misc/ocxl/context.c b/drivers/misc/ocxl/context.c index c10a940e3b38..3498a0199bde 100644 --- a/drivers/misc/ocxl/context.c +++ b/drivers/misc/ocxl/context.c @@ -134,7 +134,7 @@ static vm_fault_t ocxl_mmap_fault(struct vm_fault *vmf) vm_fault_t ret; offset = vmf->pgoff << PAGE_SHIFT; - pr_debug("%s: pasid %d address 0x%lx offset 0x%llx\n", __func__, + pr_debug("%s: pasid %d address %#lx offset %#llx\n", __func__, ctx->pasid, vmf->address, offset); if (offset < ctx->afu->irq_base_offset) diff --git a/drivers/misc/ocxl/trace.h b/drivers/misc/ocxl/trace.h index bcb7ff330c1e..8d2f53812edd 100644 --- a/drivers/misc/ocxl/trace.h +++ b/drivers/misc/ocxl/trace.h @@ -28,7 +28,7 @@ DECLARE_EVENT_CLASS(ocxl_context, __entry->tidr = tidr; ), - TP_printk("linux pid=%d spa=0x%p pasid=0x%x pidr=0x%x tidr=0x%x", + TP_printk("linux pid=%d spa=%p pasid=%#x pidr=%#x tidr=%#x", __entry->pid, __entry->spa, __entry->pasid, @@ -61,7 +61,7 @@
TRACE_EVENT(ocxl_terminate_pasid, __entry->rc = rc; ), - TP_printk("pasid=0x%x rc=%d", + TP_printk("pasid=%#x rc=%d", __entry->pasid, __entry->rc ) @@ -87,7 +87,7 @@ DECLARE_EVENT_CLASS(ocxl_fault_handler, __entry->tfc = tfc; ), - TP_printk("spa=%p pe=0x%llx dsisr=0x%llx dar=0x%llx tfc=0x%llx", + TP_printk("spa=%p pe=%#llx dsisr=%#llx dar=%#llx tfc=%#llx", __entry->spa, __entry->pe, __entry->dsisr, @@ -127,7 +127,7 @@ TRACE_EVENT(ocxl_afu_irq_alloc, __entry->irq_offset = irq_offset; ), - TP_printk("pasid=0x%x irq_id=%d virq=%u hw_irq=%d irq_offset=0x%llx", + TP_printk("pasid=%#x irq_id=%d virq=%u hw_irq=%d irq_offset=0x%llx", __entry->pasid, __entry->irq_id, __entry->virq, @@ -150,7 +150,7 @@ TRACE_EVENT(ocxl_afu_irq_free, __entry->irq_id = irq_id; ), - TP_printk("pasid=0x%x irq_id=%d", + TP_printk("pasid=%#x irq_id=%d", __entry->pasid, __entry->irq_id )
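For readers comparing the two spellings, the following userspace sketch shows how "0x%x" and "%#x" behave with the standard C library's snprintf(). The helper names are made up for this illustration, and the kernel's own printk implementation is a separate code base, so treat this as a description of ISO C behaviour only. The one visible difference in standard C is the value zero, where the '#' flag adds no prefix.

```c
#include <assert.h>   /* used by the checks in the usage note */
#include <stdio.h>
#include <string.h>

/* Format with a literal "0x" prefix written in the format string. */
int fmt_literal(char *buf, size_t len, unsigned int v)
{
	return snprintf(buf, len, "0x%x", v);
}

/* Format with the '#' (alternate form) flag instead. */
int fmt_alt(char *buf, size_t len, unsigned int v)
{
	return snprintf(buf, len, "%#x", v);
}

/* Return 1 when both spellings produce the same text for v, else 0. */
int same_output(unsigned int v)
{
	char a[32], b[32];

	fmt_literal(a, sizeof(a), v);
	fmt_alt(b, sizeof(b), v);
	return strcmp(a, b) == 0;
}
```

In standard C, same_output(0x1f) is true (both give "0x1f"), while same_output(0) is false: the literal form prints "0x0" and the alternate form prints "0".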
Re: [PATCH 5/5] ocxl: Remove some unused exported symbols
On 27/02/2019 at 05:57, Alastair D'Silva wrote: From: Alastair D'Silva Remove some unused exported symbols. Signed-off-by: Alastair D'Silva --- If you have a respin of the series, that patch also adds a comment around ocxl_context_attach(), which is for later. But in any case: Acked-by: Frederic Barrat drivers/misc/ocxl/config.c| 2 -- drivers/misc/ocxl/ocxl_internal.h | 26 +- include/misc/ocxl.h | 23 --- 3 files changed, 25 insertions(+), 26 deletions(-) diff --git a/drivers/misc/ocxl/config.c b/drivers/misc/ocxl/config.c index 026ac2ac4f9c..c90c2e4875bf 100644 --- a/drivers/misc/ocxl/config.c +++ b/drivers/misc/ocxl/config.c @@ -299,7 +299,6 @@ int ocxl_config_check_afu_index(struct pci_dev *dev, } return 1; } -EXPORT_SYMBOL_GPL(ocxl_config_check_afu_index); static int read_afu_name(struct pci_dev *dev, struct ocxl_fn_config *fn, struct ocxl_afu_config *afu) @@ -535,7 +534,6 @@ int ocxl_config_get_pasid_info(struct pci_dev *dev, int *count) { return pnv_ocxl_get_pasid_count(dev, count); } -EXPORT_SYMBOL_GPL(ocxl_config_get_pasid_info); void ocxl_config_set_afu_pasid(struct pci_dev *dev, int pos, int pasid_base, u32 pasid_count_log) diff --git a/drivers/misc/ocxl/ocxl_internal.h b/drivers/misc/ocxl/ocxl_internal.h index 321b29e77f45..cd5a1e3cc950 100644 --- a/drivers/misc/ocxl/ocxl_internal.h +++ b/drivers/misc/ocxl/ocxl_internal.h @@ -107,10 +107,34 @@ void ocxl_pasid_afu_free(struct ocxl_fn *fn, u32 start, u32 size); int ocxl_actag_afu_alloc(struct ocxl_fn *fn, u32 size); void ocxl_actag_afu_free(struct ocxl_fn *fn, u32 start, u32 size); +/* + * Get the max PASID value that can be used by the function + */ +int ocxl_config_get_pasid_info(struct pci_dev *dev, int *count); + +int ocxl_context_attach(struct ocxl_context *ctx, u64 amr); + +/* + * Check if an AFU index is valid for the given function.
+ * + * AFU indexes can be sparse, so a driver should check all indexes up + * to the maximum found in the function description + */ +int ocxl_config_check_afu_index(struct pci_dev *dev, + struct ocxl_fn_config *fn, int afu_idx); + +/** + * Update values within a Process Element + * + * link_handle: the link handle associated with the process element + * pasid: the PASID for the AFU context + * tid: the new thread id for the process element + */ +int ocxl_link_update_pe(void *link_handle, int pasid, __u16 tid); + struct ocxl_context *ocxl_context_alloc(void); int ocxl_context_init(struct ocxl_context *ctx, struct ocxl_afu *afu, struct address_space *mapping); -int ocxl_context_attach(struct ocxl_context *ctx, u64 amr); int ocxl_context_mmap(struct ocxl_context *ctx, struct vm_area_struct *vma); int ocxl_context_detach(struct ocxl_context *ctx); diff --git a/include/misc/ocxl.h b/include/misc/ocxl.h index 4544573cc93c..9530d3be1b30 100644 --- a/include/misc/ocxl.h +++ b/include/misc/ocxl.h @@ -56,15 +56,6 @@ struct ocxl_fn_config { int ocxl_config_read_function(struct pci_dev *dev, struct ocxl_fn_config *fn); -/* - * Check if an AFU index is valid for the given function. - * - * AFU indexes can be sparse, so a driver should check all indexes up - * to the maximum found in the function description - */ -int ocxl_config_check_afu_index(struct pci_dev *dev, - struct ocxl_fn_config *fn, int afu_idx); - /* * Read the configuration space of a function for the AFU specified by * the index 'afu_idx'. Fills in a ocxl_afu_config structure @@ -74,11 +65,6 @@ int ocxl_config_read_afu(struct pci_dev *dev, struct ocxl_afu_config *afu, u8 afu_idx); -/* - * Get the max PASID value that can be used by the function - */ -int ocxl_config_get_pasid_info(struct pci_dev *dev, int *count); - /* * Tell an AFU, by writing in the configuration space, the PASIDs that * it can use. 
Range starts at 'pasid_base' and its size is a multiple @@ -188,15 +174,6 @@ int ocxl_link_add_pe(void *link_handle, int pasid, u32 pidr, u32 tidr, void (*xsl_err_cb)(void *data, u64 addr, u64 dsisr), void *xsl_err_data); -/** - * Update values within a Process Element - * - * link_handle: the link handle associated with the process element - * pasid: the PASID for the AFU context - * tid: the new thread id for the process element - */ -int ocxl_link_update_pe(void *link_handle, int pasid, __u16 tid); - /* * Remove a Process Element from the Shared Process Area for a link */
Re: [PATCH 4/5] ocxl: Remove superfluous 'extern' from headers
On 27/02/2019 at 05:57, Alastair D'Silva wrote: From: Alastair D'Silva The 'extern' keyword adds no value here. Signed-off-by: Alastair D'Silva --- Acked-by: Frederic Barrat drivers/misc/ocxl/ocxl_internal.h | 54 +++ include/misc/ocxl.h | 36 ++--- 2 files changed, 44 insertions(+), 46 deletions(-) diff --git a/drivers/misc/ocxl/ocxl_internal.h b/drivers/misc/ocxl/ocxl_internal.h index a32f2151029f..321b29e77f45 100644 --- a/drivers/misc/ocxl/ocxl_internal.h +++ b/drivers/misc/ocxl/ocxl_internal.h @@ -16,7 +16,6 @@ extern struct pci_driver ocxl_pci_driver; - struct ocxl_fn { struct device dev; int bar_used[3]; @@ -92,41 +91,40 @@ struct ocxl_process_element { __be32 software_state; }; +struct ocxl_afu *ocxl_afu_get(struct ocxl_afu *afu); +void ocxl_afu_put(struct ocxl_afu *afu); -extern struct ocxl_afu *ocxl_afu_get(struct ocxl_afu *afu); -extern void ocxl_afu_put(struct ocxl_afu *afu); - -extern int ocxl_create_cdev(struct ocxl_afu *afu); -extern void ocxl_destroy_cdev(struct ocxl_afu *afu); -extern int ocxl_register_afu(struct ocxl_afu *afu); -extern void ocxl_unregister_afu(struct ocxl_afu *afu); +int ocxl_create_cdev(struct ocxl_afu *afu); +void ocxl_destroy_cdev(struct ocxl_afu *afu); +int ocxl_register_afu(struct ocxl_afu *afu); +void ocxl_unregister_afu(struct ocxl_afu *afu); -extern int ocxl_file_init(void); -extern void ocxl_file_exit(void); +int ocxl_file_init(void); +void ocxl_file_exit(void); -extern int ocxl_pasid_afu_alloc(struct ocxl_fn *fn, u32 size); -extern void ocxl_pasid_afu_free(struct ocxl_fn *fn, u32 start, u32 size); -extern int ocxl_actag_afu_alloc(struct ocxl_fn *fn, u32 size); -extern void ocxl_actag_afu_free(struct ocxl_fn *fn, u32 start, u32 size); +int ocxl_pasid_afu_alloc(struct ocxl_fn *fn, u32 size); +void ocxl_pasid_afu_free(struct ocxl_fn *fn, u32 start, u32 size); +int ocxl_actag_afu_alloc(struct ocxl_fn *fn, u32 size); +void ocxl_actag_afu_free(struct ocxl_fn *fn, u32 start, u32 size); -extern struct ocxl_context
*ocxl_context_alloc(void); -extern int ocxl_context_init(struct ocxl_context *ctx, struct ocxl_afu *afu, +struct ocxl_context *ocxl_context_alloc(void); +int ocxl_context_init(struct ocxl_context *ctx, struct ocxl_afu *afu, struct address_space *mapping); -extern int ocxl_context_attach(struct ocxl_context *ctx, u64 amr); -extern int ocxl_context_mmap(struct ocxl_context *ctx, +int ocxl_context_attach(struct ocxl_context *ctx, u64 amr); +int ocxl_context_mmap(struct ocxl_context *ctx, struct vm_area_struct *vma); -extern int ocxl_context_detach(struct ocxl_context *ctx); -extern void ocxl_context_detach_all(struct ocxl_afu *afu); -extern void ocxl_context_free(struct ocxl_context *ctx); +int ocxl_context_detach(struct ocxl_context *ctx); +void ocxl_context_detach_all(struct ocxl_afu *afu); +void ocxl_context_free(struct ocxl_context *ctx); -extern int ocxl_sysfs_add_afu(struct ocxl_afu *afu); -extern void ocxl_sysfs_remove_afu(struct ocxl_afu *afu); +int ocxl_sysfs_add_afu(struct ocxl_afu *afu); +void ocxl_sysfs_remove_afu(struct ocxl_afu *afu); -extern int ocxl_afu_irq_alloc(struct ocxl_context *ctx, u64 *irq_offset); -extern int ocxl_afu_irq_free(struct ocxl_context *ctx, u64 irq_offset); -extern void ocxl_afu_irq_free_all(struct ocxl_context *ctx); -extern int ocxl_afu_irq_set_fd(struct ocxl_context *ctx, u64 irq_offset, +int ocxl_afu_irq_alloc(struct ocxl_context *ctx, u64 *irq_offset); +int ocxl_afu_irq_free(struct ocxl_context *ctx, u64 irq_offset); +void ocxl_afu_irq_free_all(struct ocxl_context *ctx); +int ocxl_afu_irq_set_fd(struct ocxl_context *ctx, u64 irq_offset, int eventfd); -extern u64 ocxl_afu_irq_get_addr(struct ocxl_context *ctx, u64 irq_offset); +u64 ocxl_afu_irq_get_addr(struct ocxl_context *ctx, u64 irq_offset); #endif /* _OCXL_INTERNAL_H_ */ diff --git a/include/misc/ocxl.h b/include/misc/ocxl.h index 9ff6ddc28e22..4544573cc93c 100644 --- a/include/misc/ocxl.h +++ b/include/misc/ocxl.h @@ -53,7 +53,7 @@ struct ocxl_fn_config { * Read the 
configuration space of a function and fill in a * ocxl_fn_config structure with all the function details */ -extern int ocxl_config_read_function(struct pci_dev *dev, +int ocxl_config_read_function(struct pci_dev *dev, struct ocxl_fn_config *fn); /* @@ -62,14 +62,14 @@ extern int ocxl_config_read_function(struct pci_dev *dev, * AFU indexes can be sparse, so a driver should check all indexes up * to the maximum found in the function description */ -extern int ocxl_config_check_afu_index(struct pci_dev *dev, +int ocxl_config_check_afu_index(struct pci_dev *dev, struct ocxl_fn_config *fn, int afu_idx); /* * Read the configuration space of a function for the AFU specified by * the index
Re: [PATCH 3/5] ocxl: read_pasid never returns an error, so make it void
On 27/02/2019 at 05:57, Alastair D'Silva wrote: From: Alastair D'Silva No need for a return value in read_pasid as it only returns 0. Signed-off-by: Alastair D'Silva Reviewed-by: Greg Kurz --- Thanks! Acked-by: Frederic Barrat drivers/misc/ocxl/config.c | 9 ++--- 1 file changed, 2 insertions(+), 7 deletions(-) diff --git a/drivers/misc/ocxl/config.c b/drivers/misc/ocxl/config.c index 0ee7856b033d..026ac2ac4f9c 100644 --- a/drivers/misc/ocxl/config.c +++ b/drivers/misc/ocxl/config.c @@ -68,7 +68,7 @@ static int find_dvsec_afu_ctrl(struct pci_dev *dev, u8 afu_idx) return 0; } -static int read_pasid(struct pci_dev *dev, struct ocxl_fn_config *fn) +static void read_pasid(struct pci_dev *dev, struct ocxl_fn_config *fn) { u16 val; int pos; @@ -89,7 +89,6 @@ static int read_pasid(struct pci_dev *dev, struct ocxl_fn_config *fn) out: dev_dbg(&dev->dev, "PASID capability:\n"); dev_dbg(&dev->dev, " Max PASID log = %d\n", fn->max_pasid_log); - return 0; } static int read_dvsec_tl(struct pci_dev *dev, struct ocxl_fn_config *fn) @@ -205,11 +204,7 @@ int ocxl_config_read_function(struct pci_dev *dev, struct ocxl_fn_config *fn) { int rc; - rc = read_pasid(dev, fn); - if (rc) { - dev_err(&dev->dev, "Invalid PASID configuration: %d\n", rc); - return -ENODEV; - } + read_pasid(dev, fn); rc = read_dvsec_tl(dev, fn); if (rc) {
Re: [PATCH v2 1/5] drivers/accel: Introduce subsystem
On 27/01/2019 at 05:31, Andrew Donnellan wrote: [+ linuxppc-dev, because cxl/ocxl are handled through powerpc - please cc on future versions of this series] On 26/1/19 8:13 am, Olof Johansson wrote: We're starting to see more of these kind of devices, the current upcoming wave will likely be around machine learning and inference engines. A few drivers have been added to drivers/misc for this, but it's timely to make it into a separate group of drivers/subsystem, to make it easier to find them, and to encourage collaboration between contributors. Over time, we expect to build shared frameworks that the drivers will make use of, but how that framework needs to look like to fill the needs is still unclear, and the best way to gain that knowledge is to give the disparate implementations a shared location. There has been some controversy around expectations for userspace stacks being open. The clear preference is to see that happen, and any driver and platform stack that is delivered like that will be given preferential treatment, and at some point in the future it might become the requirement. Until then, the bare minimum we need is an open low-level userspace such that the driver and HW interfaces can be exercised if someone is modifying the driver, even if the full details of the workload are not always available. Bootstrapping this with myself and Greg as maintainers (since the current drivers will be moving out of drivers/misc). Looking forward to expanding that group over time.
[snip] + +Hardware offload accelerator subsystem +== + +This is a brief overview of the subsystem (grouping) of hardware +accelerators kept under drivers/accel + +Types of hardware supported +--- + + The general types of hardware supported are hardware devices that has + general interactions of sending commands and buffers to the hardware, + returning completions and possible filled buffers back, together + with the usual driver pieces around hardware control, setup, error + handling, etc. + + Drivers that fit into other subsystems are expected to be merged + there, and use the appropriate userspace interfaces of said functional + areas. We don't expect to see drivers for network, storage, graphics + and similar hardware implemented by drivers here. + +Expectations for contributions +-- + + - Platforms and hardware that has fully open stacks, from Firmware to + Userspace, are always going to be given preferential treatment. These + platforms give the best insight for behavior and interaction of all + layers, including ability to improve implementation across the stack + over time. + + - If a platform is partially proprietary, it is still expected that the + portions that interact the driver can be shared in a form that allows + for exercising the hardware/driver and evolution of the interface over + time. This could be separated into a shared library and test/sample + programs, for example. + + - Over time, there is an expectation to converge drivers over to shared + frameworks and interfaces. Until then, the general rule is that no + more than one driver per vendor will be acceptable. For vendors that + aren't participating in the work towards shared frameworks over time, + we reserve the right to phase out support for the hardware. How exactly do generic drivers for interconnect protocols, such as cxl/ocxl, fit in here? 
cxl and ocxl are not drivers for a specific device, they are generic drivers which can be used with any device implementing the CAPI or OpenCAPI protocol respectively - many of which will be FPGA boards flashed with customer-designed accelerator cores for specific workloads, some will be accelerators using ASICs or using FPGA images supplied by vendors, some will be driven from userspace, others using the cxl/ocxl kernel API, etc. I have the same reservation as Andrew. While my first reaction was to think that cxl and ocxl should be part of the accel subsystem, they hardly seem to fit the stated goals. Furthermore, there are implications there, as all the distros currently shipping cxl and ocxl as modules on powerpc would need to have their config modified to enable CONFIG_ACCEL. Fred
Re: [PATCH v2] misc: cxl: Use device_type helpers to access the node type
On 05/12/2018 at 20:16, Rob Herring wrote: Remove directly accessing device_type property and use the of_node_is_type accessor instead. While not using it here, this is part of eventually removing the struct device_node.type pointer. Cc: Frederic Barrat Cc: Arnd Bergmann Cc: Greg Kroah-Hartman Cc: linuxppc-...@lists.ozlabs.org Acked-by: Andrew Donnellan Signed-off-by: Rob Herring --- v2: - Reword commit message as this change was using the .type ptr. Acked-by: Frederic Barrat drivers/misc/cxl/pci.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c index b66d832d3233..c79ba1c699ad 100644 --- a/drivers/misc/cxl/pci.c +++ b/drivers/misc/cxl/pci.c @@ -1718,7 +1718,6 @@ int cxl_slot_is_switched(struct pci_dev *dev) { struct device_node *np; int depth = 0; - const __be32 *prop; if (!(np = pci_device_to_OF_node(dev))) { pr_err("cxl: np = NULL\n"); @@ -1727,8 +1726,7 @@ int cxl_slot_is_switched(struct pci_dev *dev) of_node_get(np); while (np) { np = of_get_next_parent(np); - prop = of_get_property(np, "device_type", NULL); - if (!prop || strcmp((char *)prop, "pciex")) + if (!of_node_is_type(np, "pciex")) break; depth++; }
Re: [RESEND PATCHv2] misc: cxl: Fix possible null pointer dereference
On 04/10/2018 at 07:02, zhong jiang wrote: It is not safe to dereference an object before a null test. It is not needed and just remove them. Ftrace can be used instead. Signed-off-by: zhong jiang --- Acked-by: Frederic Barrat drivers/misc/cxl/guest.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/misc/cxl/guest.c b/drivers/misc/cxl/guest.c index 3bc0c15..5d28d9e 100644 --- a/drivers/misc/cxl/guest.c +++ b/drivers/misc/cxl/guest.c @@ -1018,8 +1018,6 @@ int cxl_guest_init_afu(struct cxl *adapter, int slice, struct device_node *afu_n void cxl_guest_remove_afu(struct cxl_afu *afu) { - pr_devel("in %s - AFU(%d)\n", __func__, afu->slice); - if (!afu) return;
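The ordering problem this patch removes can be shown with a small standalone sketch. The struct and function below are hypothetical stand-ins rather than the cxl code; the point is only that any use of afu->slice has to come after the NULL test, otherwise the test is reached too late to protect anything.

```c
#include <assert.h>   /* used by the checks in the usage note */
#include <stddef.h>

/* Hypothetical stand-in for the driver's AFU structure. */
struct demo_afu {
	int slice;
};

/*
 * Returns the slice number, or -1 when no AFU was passed in.
 * The NULL test comes first; the dereference happens only once the
 * pointer is known to be valid.
 */
int demo_remove_afu(const struct demo_afu *afu)
{
	if (!afu)
		return -1;

	/* Safe to dereference from here on. */
	return afu->slice;
}
```

Calling demo_remove_afu(NULL) takes the early return, while a valid pointer reaches the dereference. Had a debug print of afu->slice been placed before the test, as in the removed pr_devel() line, the NULL case would have dereferenced the pointer first.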
Re: [PATCH -next] ocxl: Fix missing unlock on error in afu_ioctl_enable_p9_wait()
On 05/06/2018 at 11:16, Wei Yongjun wrote: Add the missing unlock before return from function afu_ioctl_enable_p9_wait() in the error handling case. Fixes: e948e06fc63a ("ocxl: Expose the thread_id needed for wait on POWER9") Signed-off-by: Wei Yongjun --- drivers/misc/ocxl/file.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/misc/ocxl/file.c b/drivers/misc/ocxl/file.c index 33ae46c..e6a6074 100644 --- a/drivers/misc/ocxl/file.c +++ b/drivers/misc/ocxl/file.c @@ -139,8 +139,10 @@ static long afu_ioctl_enable_p9_wait(struct ocxl_context *ctx, // Locks both status & tidr mutex_lock(&ctx->status_mutex); if (!ctx->tidr) { - if (set_thread_tidr(current)) + if (set_thread_tidr(current)) { + mutex_unlock(&ctx->status_mutex); return -ENOENT; + } O_o Thanks for fixing it Acked-by: Frederic Barrat ctx->tidr = current->thread.tidr; }
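The class of bug fixed here, an early return that skips the unlock, can be sketched in plain C. Everything below is a toy under stated assumptions: demo_ctx, demo_lock()/demo_unlock() (a counter standing in for a real mutex), demo_assign_tid() (standing in for set_thread_tidr()) and the value 42 are all invented for illustration. The sketch uses a single exit label, an alternative to the unlock-before-return that the patch itself adds; both leave the lock released on the error path.

```c
#include <assert.h>   /* used by the checks in the usage note */
#include <errno.h>

/* Toy context: lock_held counts outstanding lock acquisitions. */
struct demo_ctx {
	int lock_held;
	int tidr;
};

static void demo_lock(struct demo_ctx *c)   { c->lock_held++; }
static void demo_unlock(struct demo_ctx *c) { c->lock_held--; }

/* Hypothetical stand-in for a TID assignment that may fail. */
static int demo_assign_tid(int should_fail)
{
	return should_fail ? -1 : 0;
}

int demo_enable_wait(struct demo_ctx *c, int should_fail)
{
	int rc = 0;

	demo_lock(c);
	if (!c->tidr) {
		if (demo_assign_tid(should_fail)) {
			rc = -ENOENT;
			goto out;	/* never return with the lock held */
		}
		c->tidr = 42;		/* arbitrary illustrative value */
	}
out:
	demo_unlock(c);
	return rc;
}
```

After a failing call, lock_held is back to zero and the error code is still reported, which is the property the original code lost when it returned from inside the locked region.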
Re: [PATCH v5 5/7] ocxl: Expose the thread_id needed for wait on POWER9
On 11/05/2018 at 12:06, Alastair D'Silva wrote: -Original Message- From: Frederic Barrat <fbar...@linux.ibm.com> Sent: Friday, 11 May 2018 7:25 PM To: Alastair D'Silva <alast...@au1.ibm.com>; linuxppc-...@lists.ozlabs.org Cc: linux-kernel@vger.kernel.org; linux-...@vger.kernel.org; mi...@neuling.org; vaib...@linux.vnet.ibm.com; aneesh.ku...@linux.vnet.ibm.com; ma...@debian.org; fe...@linux.vnet.ibm.com; pombreda...@nexb.com; suka...@linux.vnet.ibm.com; npig...@gmail.com; gre...@linuxfoundation.org; a...@arndb.de; andrew.donnel...@au1.ibm.com; fbar...@linux.vnet.ibm.com; cor...@lwn.net; Alastair D'Silva <alast...@d-silva.org> Subject: Re: [PATCH v5 5/7] ocxl: Expose the thread_id needed for wait on POWER9 On 11/05/2018 at 08:13, Alastair D'Silva wrote: From: Alastair D'Silva <alast...@d-silva.org> In order to successfully issue as_notify, an AFU needs to know the TID to notify, which in turn means that this information should be available in userspace so it can be communicated to the AFU. Signed-off-by: Alastair D'Silva <alast...@d-silva.org> --- Ok, so we keep the limitation of having only one thread per context able to call 'wait', even though we don't have to worry about depleting the pool of TIDs any more. I think that's acceptable, though we don't really have a reason to justify it any more. Any reason you want to keep it that way? No strong reason, just trying to minimise the amount of changes. We can always expand the scope later, if we have a use-case for it. ok. Agreed, it's not worth holding up this series, we can send a follow up patch.
Fred Fred drivers/misc/ocxl/context.c | 5 ++- drivers/misc/ocxl/file.c | 53 +++ drivers/misc/ocxl/link.c | 36 + drivers/misc/ocxl/ocxl_internal.h | 1 + include/misc/ocxl.h | 9 ++ include/uapi/misc/ocxl.h | 8 + 6 files changed, 111 insertions(+), 1 deletion(-) diff --git a/drivers/misc/ocxl/context.c b/drivers/misc/ocxl/context.c index 909e8807824a..95f74623113e 100644 --- a/drivers/misc/ocxl/context.c +++ b/drivers/misc/ocxl/context.c @@ -34,6 +34,8 @@ int ocxl_context_init(struct ocxl_context *ctx, struct ocxl_afu *afu, mutex_init(&ctx->xsl_error_lock); mutex_init(&ctx->irq_lock); idr_init(&ctx->irq_idr); + ctx->tidr = 0; + /* * Keep a reference on the AFU to make sure it's valid for the * duration of the life of the context @@ -65,6 +67,7 @@ int ocxl_context_attach(struct ocxl_context *ctx, u64 amr) { int rc; + // Locks both status & tidr mutex_lock(&ctx->status_mutex); if (ctx->status != OPENED) { rc = -EIO; @@ -72,7 +75,7 @@ int ocxl_context_attach(struct ocxl_context *ctx, u64 amr) } rc = ocxl_link_add_pe(ctx->afu->fn->link, ctx->pasid, - current->mm->context.id, 0, amr, current->mm, + current->mm->context.id, ctx->tidr, amr, current->mm, xsl_fault_error, ctx); if (rc) goto out; diff --git a/drivers/misc/ocxl/file.c b/drivers/misc/ocxl/file.c index 038509e5d031..eb409a469f21 100644 --- a/drivers/misc/ocxl/file.c +++ b/drivers/misc/ocxl/file.c @@ -5,6 +5,8 @@ #include #include #include +#include +#include #include "ocxl_internal.h" @@ -123,11 +125,55 @@ static long afu_ioctl_get_metadata(struct ocxl_context *ctx, return 0; } +#ifdef CONFIG_PPC64 +static long afu_ioctl_enable_p9_wait(struct ocxl_context *ctx, + struct ocxl_ioctl_p9_wait __user *uarg) { + struct ocxl_ioctl_p9_wait arg; + + memset(&arg, 0, sizeof(arg)); + + if (cpu_has_feature(CPU_FTR_P9_TIDR)) { + enum ocxl_context_status status; + + // Locks both status & tidr + mutex_lock(&ctx->status_mutex); + if (!ctx->tidr) { + if (set_thread_tidr(current)) + return -ENOENT; + + ctx->tidr = current->thread.tidr; + } + + status =
ctx->status; + mutex_unlock(&ctx->status_mutex); + + if (status == ATTACHED) { + int rc; + struct link *link = ctx->afu->fn->link; + + rc = ocxl_link_update_pe(link, ctx->pasid, ctx->tidr); + if (rc) + return rc; + } + + arg.thread_id = ctx->tidr; + } else + return -ENOENT; + + if (copy_to_user(uarg, &arg, sizeof(arg))) + return -EFAULT; + + return 0; +} +#endif + #define CMD_STR(x) (x == OCXL_IOCTL_ATTACH ? "ATTACH" :
Re: [PATCH] misc: cxl: Change return type to vm_fault_t
On 17/04/2018 16:53, Souptick Joarder wrote:

Use new return type vm_fault_t for fault handler. For now, this is just documenting that the function returns a VM_FAULT value rather than an errno. Once all instances are converted, vm_fault_t will become a distinct type.

Reference id -> 1c8f422059ae ("mm: change return type to vm_fault_t")

Previously, cxl_mmap_fault returned VM_FAULT_NOPAGE as the default value irrespective of the vm_insert_pfn() return value. This bug is fixed with the new vmf_insert_pfn(), which returns a VM_FAULT_ type based on err.

Signed-off-by: Souptick Joarder <jrdr.li...@gmail.com>
---

It looks ok, and it passed some basic testing.

Acked-by: Frederic Barrat <fbar...@linux.vnet.ibm.com>

Fred

 drivers/misc/cxl/context.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/misc/cxl/context.c b/drivers/misc/cxl/context.c
index 7ff315a..c6ec872 100644
--- a/drivers/misc/cxl/context.c
+++ b/drivers/misc/cxl/context.c
@@ -128,11 +128,12 @@ void cxl_context_set_mapping(struct cxl_context *ctx,
 	mutex_unlock(&ctx->mapping_lock);
 }
 
-static int cxl_mmap_fault(struct vm_fault *vmf)
+static vm_fault_t cxl_mmap_fault(struct vm_fault *vmf)
 {
 	struct vm_area_struct *vma = vmf->vma;
 	struct cxl_context *ctx = vma->vm_file->private_data;
 	u64 area, offset;
+	vm_fault_t ret;
 
 	offset = vmf->pgoff << PAGE_SHIFT;
 
@@ -169,11 +170,11 @@ static int cxl_mmap_fault(struct vm_fault *vmf)
 		return VM_FAULT_SIGBUS;
 	}
 
-	vm_insert_pfn(vma, vmf->address, (area + offset) >> PAGE_SHIFT);
+	ret = vmf_insert_pfn(vma, vmf->address, (area + offset) >> PAGE_SHIFT);
 
 	mutex_unlock(&ctx->status_mutex);
 
-	return VM_FAULT_NOPAGE;
+	return ret;
 }
 
 static const struct vm_operations_struct cxl_mmap_vmops = {
--
1.9.1
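The behaviour change in this patch (returning the result of the PFN insertion instead of an unconditional VM_FAULT_NOPAGE) can be modelled in userspace. The following is only a sketch, not kernel code: the `SKETCH_VM_FAULT_*` values and both `sketch_fault_*()` helpers are hypothetical stand-ins, and the mapping used here (-ENOMEM becomes OOM, any other error except -EBUSY becomes SIGBUS, everything else NOPAGE) is an assumption about how a vmf_insert_pfn()-style wrapper translates an errno. It should be checked against the kernel tree in use.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the kernel's VM_FAULT_* codes (illustrative values). */
enum sketch_vm_fault {
	SKETCH_VM_FAULT_NOPAGE = 0,
	SKETCH_VM_FAULT_OOM,
	SKETCH_VM_FAULT_SIGBUS,
};

/* Old handler shape: the insertion result is dropped, NOPAGE is always returned. */
static enum sketch_vm_fault sketch_fault_old(int insert_err)
{
	(void)insert_err;
	return SKETCH_VM_FAULT_NOPAGE;
}

/* New handler shape: the errno from the insertion is translated (assumed mapping). */
static enum sketch_vm_fault sketch_fault_new(int insert_err)
{
	if (insert_err == -ENOMEM)
		return SKETCH_VM_FAULT_OOM;
	if (insert_err < 0 && insert_err != -EBUSY)
		return SKETCH_VM_FAULT_SIGBUS;
	return SKETCH_VM_FAULT_NOPAGE;
}
```

With this model, an out-of-memory insertion is silently reported as success by the old shape, while the new shape surfaces it to the fault path.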
Re: [PATCH v5 7/7] ocxl: Document new OCXL IOCTLs
On 11/05/2018 08:13, Alastair D'Silva wrote:

From: Alastair D'Silva <alast...@d-silva.org>

Signed-off-by: Alastair D'Silva <alast...@d-silva.org>
---

Acked-by: Frederic Barrat <fbar...@linux.vnet.ibm.com>

 Documentation/accelerators/ocxl.rst | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/Documentation/accelerators/ocxl.rst b/Documentation/accelerators/ocxl.rst
index ddcc58d01cfb..14cefc020e2d 100644
--- a/Documentation/accelerators/ocxl.rst
+++ b/Documentation/accelerators/ocxl.rst
@@ -157,6 +157,17 @@ OCXL_IOCTL_GET_METADATA:
   Obtains configuration information from the card, such at the size of
   MMIO areas, the AFU version, and the PASID for the current context.
 
+OCXL_IOCTL_ENABLE_P9_WAIT:
+
+  Allows the AFU to wake a userspace thread executing 'wait'. Returns
+  information to userspace to allow it to configure the AFU. Note that
+  this is only available on POWER9.
+
+OCXL_IOCTL_GET_FEATURES:
+
+  Reports on which CPU features that affect OpenCAPI are usable from
+  userspace.
+
 
 mmap
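A userspace library would typically consult OCXL_IOCTL_GET_FEATURES before attempting OCXL_IOCTL_ENABLE_P9_WAIT. The sketch below models only the flag handling, so it runs without an OpenCAPI device. The `sketch_` names are hypothetical; the flag value 0x01 and the `flags[4]` layout mirror the uapi definitions in patch 6/7 of this series, and a real program would include the uapi header and fill the struct via `ioctl()` on an opened AFU device rather than via `sketch_fill_features()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of the uapi flag and struct from patch 6/7 (names are hypothetical). */
#define SKETCH_FEATURES_FLAGS0_P9_WAIT 0x01

struct sketch_ioctl_features {
	uint64_t flags[4];
};

/* Stand-in for the kernel side: zero the struct, then set the bit if the CPU supports it. */
static void sketch_fill_features(struct sketch_ioctl_features *f, bool cpu_has_tidr)
{
	memset(f, 0, sizeof(*f));
	if (cpu_has_tidr)
		f->flags[0] |= SKETCH_FEATURES_FLAGS0_P9_WAIT;
}

/* Userspace check: is it worth calling the POWER9 wait ioctl at all? */
static bool sketch_has_p9_wait(const struct sketch_ioctl_features *f)
{
	return (f->flags[0] & SKETCH_FEATURES_FLAGS0_P9_WAIT) != 0;
}
```

Keeping the check in one helper means callers never test raw bits, which leaves room for the remaining flag words to gain meaning later.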
Re: [PATCH v5 6/7] ocxl: Add an IOCTL so userspace knows what OCXL features are available
On 11/05/2018 08:13, Alastair D'Silva wrote:

From: Alastair D'Silva <alast...@d-silva.org>

In order for a userspace AFU driver to call the POWER9 specific OCXL_IOCTL_ENABLE_P9_WAIT, it needs to verify that it can actually make that call.

Signed-off-by: Alastair D'Silva <alast...@d-silva.org>
---

Acked-by: Frederic Barrat <fbar...@linux.vnet.ibm.com>

 drivers/misc/ocxl/file.c | 25 +
 include/uapi/misc/ocxl.h |  6 ++
 2 files changed, 31 insertions(+)

diff --git a/drivers/misc/ocxl/file.c b/drivers/misc/ocxl/file.c
index eb409a469f21..33ae46ce0a8a 100644
--- a/drivers/misc/ocxl/file.c
+++ b/drivers/misc/ocxl/file.c
@@ -168,12 +168,32 @@ static long afu_ioctl_enable_p9_wait(struct ocxl_context *ctx,
 }
 #endif
 
+
+static long afu_ioctl_get_features(struct ocxl_context *ctx,
+		struct ocxl_ioctl_features __user *uarg)
+{
+	struct ocxl_ioctl_features arg;
+
+	memset(&arg, 0, sizeof(arg));
+
+#ifdef CONFIG_PPC64
+	if (cpu_has_feature(CPU_FTR_P9_TIDR))
+		arg.flags[0] |= OCXL_IOCTL_FEATURES_FLAGS0_P9_WAIT;
+#endif
+
+	if (copy_to_user(uarg, &arg, sizeof(arg)))
+		return -EFAULT;
+
+	return 0;
+}
+
 #define CMD_STR(x) (x == OCXL_IOCTL_ATTACH ? "ATTACH" :			\
 			x == OCXL_IOCTL_IRQ_ALLOC ? "IRQ_ALLOC" :	\
 			x == OCXL_IOCTL_IRQ_FREE ? "IRQ_FREE" :		\
 			x == OCXL_IOCTL_IRQ_SET_FD ? "IRQ_SET_FD" :	\
 			x == OCXL_IOCTL_GET_METADATA ? "GET_METADATA" :	\
 			x == OCXL_IOCTL_ENABLE_P9_WAIT ? "ENABLE_P9_WAIT" :	\
+			x == OCXL_IOCTL_GET_FEATURES ? "GET_FEATURES" :	\
 			"UNKNOWN")
 
 static long afu_ioctl(struct file *file, unsigned int cmd,
@@ -239,6 +259,11 @@ static long afu_ioctl(struct file *file, unsigned int cmd,
 		break;
 #endif
 
+	case OCXL_IOCTL_GET_FEATURES:
+		rc = afu_ioctl_get_features(ctx,
+				(struct ocxl_ioctl_features __user *) args);
+		break;
+
 	default:
 		rc = -EINVAL;
 	}
diff --git a/include/uapi/misc/ocxl.h b/include/uapi/misc/ocxl.h
index 561e6f0dfcb7..97937cfa3baa 100644
--- a/include/uapi/misc/ocxl.h
+++ b/include/uapi/misc/ocxl.h
@@ -55,6 +55,11 @@ struct ocxl_ioctl_p9_wait {
 	__u64 reserved3[3];
 };
 
+#define OCXL_IOCTL_FEATURES_FLAGS0_P9_WAIT	0x01
+struct ocxl_ioctl_features {
+	__u64 flags[4];
+};
+
 struct ocxl_ioctl_irq_fd {
 	__u64 irq_offset;
 	__s32 eventfd;
@@ -70,5 +75,6 @@ struct ocxl_ioctl_irq_fd {
 #define OCXL_IOCTL_IRQ_SET_FD	_IOW(OCXL_MAGIC, 0x13, struct ocxl_ioctl_irq_fd)
 #define OCXL_IOCTL_GET_METADATA _IOR(OCXL_MAGIC, 0x14, struct ocxl_ioctl_metadata)
 #define OCXL_IOCTL_ENABLE_P9_WAIT	_IOR(OCXL_MAGIC, 0x15, struct ocxl_ioctl_p9_wait)
+#define OCXL_IOCTL_GET_FEATURES _IOR(OCXL_MAGIC, 0x16, struct ocxl_ioctl_features)
 
 #endif /* _UAPI_MISC_OCXL_H */
Re: [PATCH v5 5/7] ocxl: Expose the thread_id needed for wait on POWER9
On 11/05/2018 08:13, Alastair D'Silva wrote:

From: Alastair D'Silva

In order to successfully issue as_notify, an AFU needs to know the TID to notify, which in turn means that this information should be available in userspace so it can be communicated to the AFU.

Signed-off-by: Alastair D'Silva
---

Ok, so we keep the limitation of having only one thread per context able to call 'wait', even though we don't have to worry about depleting the pool of TIDs any more. I think that's acceptable, though we don't really have a reason to justify it any more. Any reason you want to keep it that way?

Fred

 drivers/misc/ocxl/context.c       |  5 ++-
 drivers/misc/ocxl/file.c          | 53 +++
 drivers/misc/ocxl/link.c          | 36 +
 drivers/misc/ocxl/ocxl_internal.h |  1 +
 include/misc/ocxl.h               |  9 ++
 include/uapi/misc/ocxl.h          |  8 +
 6 files changed, 111 insertions(+), 1 deletion(-)

diff --git a/drivers/misc/ocxl/context.c b/drivers/misc/ocxl/context.c
index 909e8807824a..95f74623113e 100644
--- a/drivers/misc/ocxl/context.c
+++ b/drivers/misc/ocxl/context.c
@@ -34,6 +34,8 @@ int ocxl_context_init(struct ocxl_context *ctx, struct ocxl_afu *afu,
 	mutex_init(&ctx->xsl_error_lock);
 	mutex_init(&ctx->irq_lock);
 	idr_init(&ctx->irq_idr);
+	ctx->tidr = 0;
+
 	/*
 	 * Keep a reference on the AFU to make sure it's valid for the
 	 * duration of the life of the context
@@ -65,6 +67,7 @@ int ocxl_context_attach(struct ocxl_context *ctx, u64 amr)
 {
 	int rc;
 
+	// Locks both status & tidr
 	mutex_lock(&ctx->status_mutex);
 	if (ctx->status != OPENED) {
 		rc = -EIO;
@@ -72,7 +75,7 @@ int ocxl_context_attach(struct ocxl_context *ctx, u64 amr)
 	}
 
 	rc = ocxl_link_add_pe(ctx->afu->fn->link, ctx->pasid,
-			current->mm->context.id, 0, amr, current->mm,
+			current->mm->context.id, ctx->tidr, amr, current->mm,
 			xsl_fault_error, ctx);
 	if (rc)
 		goto out;
diff --git a/drivers/misc/ocxl/file.c b/drivers/misc/ocxl/file.c
index 038509e5d031..eb409a469f21 100644
--- a/drivers/misc/ocxl/file.c
+++ b/drivers/misc/ocxl/file.c
@@ -5,6 +5,8 @@
 #include
 #include
 #include
+#include
+#include
 #include "ocxl_internal.h"
@@ -123,11 +125,55 @@ static long afu_ioctl_get_metadata(struct ocxl_context *ctx,
 	return 0;
 }
 
+#ifdef CONFIG_PPC64
+static long afu_ioctl_enable_p9_wait(struct ocxl_context *ctx,
+		struct ocxl_ioctl_p9_wait __user *uarg)
+{
+	struct ocxl_ioctl_p9_wait arg;
+
+	memset(&arg, 0, sizeof(arg));
+
+	if (cpu_has_feature(CPU_FTR_P9_TIDR)) {
+		enum ocxl_context_status status;
+
+		// Locks both status & tidr
+		mutex_lock(&ctx->status_mutex);
+		if (!ctx->tidr) {
+			if (set_thread_tidr(current))
+				return -ENOENT;
+
+			ctx->tidr = current->thread.tidr;
+		}
+
+		status = ctx->status;
+		mutex_unlock(&ctx->status_mutex);
+
+		if (status == ATTACHED) {
+			int rc;
+			struct link *link = ctx->afu->fn->link;
+
+			rc = ocxl_link_update_pe(link, ctx->pasid, ctx->tidr);
+			if (rc)
+				return rc;
+		}
+
+		arg.thread_id = ctx->tidr;
+	} else
+		return -ENOENT;
+
+	if (copy_to_user(uarg, &arg, sizeof(arg)))
+		return -EFAULT;
+
+	return 0;
+}
+#endif
+
 #define CMD_STR(x) (x == OCXL_IOCTL_ATTACH ? "ATTACH" :			\
 			x == OCXL_IOCTL_IRQ_ALLOC ? "IRQ_ALLOC" :	\
 			x == OCXL_IOCTL_IRQ_FREE ? "IRQ_FREE" :		\
 			x == OCXL_IOCTL_IRQ_SET_FD ? "IRQ_SET_FD" :	\
 			x == OCXL_IOCTL_GET_METADATA ? "GET_METADATA" :	\
+			x == OCXL_IOCTL_ENABLE_P9_WAIT ? "ENABLE_P9_WAIT" :	\
 			"UNKNOWN")
 
 static long afu_ioctl(struct file *file, unsigned int cmd,
@@ -186,6 +232,13 @@ static long afu_ioctl(struct file *file, unsigned int cmd,
 				(struct ocxl_ioctl_metadata __user *) args);
 		break;
 
+#ifdef CONFIG_PPC64
+	case OCXL_IOCTL_ENABLE_P9_WAIT:
+		rc = afu_ioctl_enable_p9_wait(ctx,
+				(struct ocxl_ioctl_p9_wait __user *) args);
+		break;
+#endif
+
 	default:
 		rc = -EINVAL;
 	}
diff --git a/drivers/misc/ocxl/link.c b/drivers/misc/ocxl/link.c
index 656e8610eec2..88876ae8f330 100644
--- a/drivers/misc/ocxl/link.c
+++ b/drivers/misc/ocxl/link.c
@@ -544,6 +544,42 @@ int ocxl_link_add_pe(void *link_handle, int
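The "one thread per context" limitation raised in the review follows from the context storing a single TID: the first caller's TID is recorded, and every later call returns that same value. A minimal userspace model of that logic is sketched below. All `sketch_` names are hypothetical, locking is deliberately left out, and `pe_tidr` merely stands in for the process-element update that the patch performs through ocxl_link_update_pe().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the per-context state touched by the ioctl. */
struct sketch_ctx {
	uint32_t tidr;     /* 0 means "no TID recorded yet" */
	bool attached;     /* models ctx->status == ATTACHED */
	uint32_t pe_tidr;  /* models the TID stored in the process element */
};

/* Returns the TID that would be reported to userspace as arg.thread_id. */
static uint32_t sketch_enable_p9_wait(struct sketch_ctx *ctx, uint32_t caller_tid)
{
	if (ctx->tidr == 0)            /* first caller: record its TID */
		ctx->tidr = caller_tid;

	if (ctx->attached)             /* already attached: refresh the process element */
		ctx->pe_tidr = ctx->tidr;

	return ctx->tidr;              /* later callers see the first caller's TID */
}
```

In this model a second thread calling with TID 42 after a first thread with TID 41 still receives 41, which is the behaviour the review comment refers to.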
Re: [PATCH v5 4/7] ocxl: Rename pnv_ocxl_spa_remove_pe to clarify it's action
On 11/05/2018 08:13, Alastair D'Silva wrote:

From: Alastair D'Silva <alast...@d-silva.org>

The function removes the process element from NPU cache.

Signed-off-by: Alastair D'Silva <alast...@d-silva.org>
---

Acked-by: Frederic Barrat <fbar...@linux.vnet.ibm.com>

 arch/powerpc/include/asm/pnv-ocxl.h   | 2 +-
 arch/powerpc/platforms/powernv/ocxl.c | 4 ++--
 drivers/misc/ocxl/link.c              | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/pnv-ocxl.h b/arch/powerpc/include/asm/pnv-ocxl.h
index f6945d3bc971..208b5503f4ed 100644
--- a/arch/powerpc/include/asm/pnv-ocxl.h
+++ b/arch/powerpc/include/asm/pnv-ocxl.h
@@ -28,7 +28,7 @@ extern int pnv_ocxl_map_xsl_regs(struct pci_dev *dev, void __iomem **dsisr,
 extern int pnv_ocxl_spa_setup(struct pci_dev *dev, void *spa_mem, int PE_mask,
 		void **platform_data);
 extern void pnv_ocxl_spa_release(void *platform_data);
-extern int pnv_ocxl_spa_remove_pe(void *platform_data, int pe_handle);
+extern int pnv_ocxl_spa_remove_pe_from_cache(void *platform_data, int pe_handle);
 
 extern int pnv_ocxl_alloc_xive_irq(u32 *irq, u64 *trigger_addr);
 extern void pnv_ocxl_free_xive_irq(u32 irq);
diff --git a/arch/powerpc/platforms/powernv/ocxl.c b/arch/powerpc/platforms/powernv/ocxl.c
index fa9b53af3c7b..8c65aacda9c8 100644
--- a/arch/powerpc/platforms/powernv/ocxl.c
+++ b/arch/powerpc/platforms/powernv/ocxl.c
@@ -475,7 +475,7 @@ void pnv_ocxl_spa_release(void *platform_data)
 }
 EXPORT_SYMBOL_GPL(pnv_ocxl_spa_release);
 
-int pnv_ocxl_spa_remove_pe(void *platform_data, int pe_handle)
+int pnv_ocxl_spa_remove_pe_from_cache(void *platform_data, int pe_handle)
 {
 	struct spa_data *data = (struct spa_data *) platform_data;
 	int rc;
@@ -483,7 +483,7 @@ int pnv_ocxl_spa_remove_pe(void *platform_data, int pe_handle)
 	rc = opal_npu_spa_clear_cache(data->phb_opal_id, data->bdfn, pe_handle);
 	return rc;
 }
-EXPORT_SYMBOL_GPL(pnv_ocxl_spa_remove_pe);
+EXPORT_SYMBOL_GPL(pnv_ocxl_spa_remove_pe_from_cache);
 
 int pnv_ocxl_alloc_xive_irq(u32 *irq, u64 *trigger_addr)
 {
diff --git a/drivers/misc/ocxl/link.c b/drivers/misc/ocxl/link.c
index f30790582dc0..656e8610eec2 100644
--- a/drivers/misc/ocxl/link.c
+++ b/drivers/misc/ocxl/link.c
@@ -599,7 +599,7 @@ int ocxl_link_remove_pe(void *link_handle, int pasid)
 	 * On powerpc, the entry needs to be cleared from the context
 	 * cache of the NPU.
 	 */
-	rc = pnv_ocxl_spa_remove_pe(link->platform_data, pe_handle);
+	rc = pnv_ocxl_spa_remove_pe_from_cache(link->platform_data, pe_handle);
 	WARN_ON(rc);
 
 	pe_data = radix_tree_delete(&spa->pe_tree, pe_handle);
Re: [PATCH v5 3/7] powerpc: use task_pid_nr() for TID allocation
On 11/05/2018 08:12, Alastair D'Silva wrote:

From: Alastair D'Silva <alast...@d-silva.org>

The current implementation of TID allocation, using a global IDR, may result in an errant process starving the system of available TIDs. Instead, use task_pid_nr(), as mentioned by the original author. The scenario described which prevented its use is not applicable, as set_thread_tidr can only be called after the task struct has been populated.

In the unlikely event that 2 threads share the TID and are waiting, all potential outcomes have been determined safe.

Signed-off-by: Alastair D'Silva <alast...@d-silva.org>
---

Thanks for adding the comment. It assumes the reader is aware that the TIDR value is only used for the notification using the 'wait' instruction, but that's likely to be the case.

Reviewed-by: Frederic Barrat <fbar...@linux.vnet.ibm.com>

 arch/powerpc/include/asm/switch_to.h |   1 -
 arch/powerpc/kernel/process.c        | 122 ++-
 2 files changed, 28 insertions(+), 95 deletions(-)

diff --git a/arch/powerpc/include/asm/switch_to.h b/arch/powerpc/include/asm/switch_to.h
index be8c9fa23983..5b03d8a82409 100644
--- a/arch/powerpc/include/asm/switch_to.h
+++ b/arch/powerpc/include/asm/switch_to.h
@@ -94,6 +94,5 @@ static inline void clear_task_ebb(struct task_struct *t)
 
 extern int set_thread_uses_vas(void);
 extern int set_thread_tidr(struct task_struct *t);
-extern void clear_thread_tidr(struct task_struct *t);
 
 #endif /* _ASM_POWERPC_SWITCH_TO_H */
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 3b00da47699b..c5b8e53acbae 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -1496,103 +1496,41 @@ int set_thread_uses_vas(void)
 }
 
 #ifdef CONFIG_PPC64
-static DEFINE_SPINLOCK(vas_thread_id_lock);
-static DEFINE_IDA(vas_thread_ida);
-
-/*
- * We need to assign a unique thread id to each thread in a process.
+/**
+ * Assign a TIDR (thread ID) for task @t and set it in the thread
+ * structure. For now, we only support setting TIDR for 'current' task.
  *
- * This thread id, referred to as TIDR, and separate from the Linux's tgid,
- * is intended to be used to direct an ASB_Notify from the hardware to the
- * thread, when a suitable event occurs in the system.
+ * Since the TID value is a truncated form of it PID, it is possible
+ * (but unlikely) for 2 threads to have the same TID. In the unlikely event
+ * that 2 threads share the same TID and are waiting, one of the following
+ * cases will happen:
  *
- * One such event is a "paste" instruction in the context of Fast Thread
- * Wakeup (aka Core-to-core wake up in the Virtual Accelerator Switchboard
- * (VAS) in POWER9.
+ * 1. The correct thread is running, the wrong thread is not
+ * In this situation, the correct thread is woken and proceeds to pass it's
+ * condition check.
  *
- * To get a unique TIDR per process we could simply reuse task_pid_nr() but
- * the problem is that task_pid_nr() is not yet available copy_thread() is
- * called. Fixing that would require changing more intrusive arch-neutral
- * code in code path in copy_process()?.
+ * 2. Neither threads are running
+ * In this situation, neither thread will be woken. When scheduled, the waiting
+ * threads will execute either a wait, which will return immediately, followed
+ * by a condition check, which will pass for the correct thread and fail
+ * for the wrong thread, or they will execute the condition check immediately.
  *
- * Further, to assign unique TIDRs within each process, we need an atomic
- * field (or an IDR) in task_struct, which again intrudes into the arch-
- * neutral code. So try to assign globally unique TIDRs for now.
+ * 3. The wrong thread is running, the correct thread is not
+ * The wrong thread will be woken, but will fail it's condition check and
+ * re-execute wait. The correct thread, when scheduled, will execute either
+ * it's condition check (which will pass), or wait, which returns immediately
+ * when called the first time after the thread is scheduled, followed by it's
+ * condition check (which will pass).
  *
- * NOTE: TIDR 0 indicates that the thread does not need a TIDR value.
- *	 For now, only threads that expect to be notified by the VAS
- *	 hardware need a TIDR value and we assign values > 0 for those.
- */
-#define MAX_THREAD_CONTEXT	((1 << 16) - 1)
-static int assign_thread_tidr(void)
-{
-	int index;
-	int err;
-	unsigned long flags;
-
-again:
-	if (!ida_pre_get(&vas_thread_ida, GFP_KERNEL))
-		return -ENOMEM;
-
-	spin_lock_irqsave(&vas_thread_id_lock, flags);
-	err = ida_get_new_above(&vas_thread_ida, 1, &index);
-	spin_unlock_irqrestore(&vas_thread_id_lock, flags);
-
-	if (err == -EAGAIN)
-		goto again;
-	else if (err)
-		return err;
-
-	if (index > MAX_THREAD_CONTEXT) {
-		spin_lock_irqsave(&vas_thread_id_lock, flags);
-		ida_rem
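The safety argument in the new comment rests on two points: a TID derived by truncating the PID can collide, and every woken thread rechecks its own condition before proceeding. The sketch below models both in userspace. It is not the kernel implementation: the `sketch_` names are hypothetical, and the 16-bit width is an assumption taken from the MAX_THREAD_CONTEXT limit in the removed code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the TID is the low 16 bits of the PID. */
static uint16_t sketch_tid_from_pid(uint32_t pid)
{
	return (uint16_t)(pid & 0xFFFFu);
}

/* Two distinct PIDs collide when their truncated TIDs are equal. */
static bool sketch_tids_collide(uint32_t pid_a, uint32_t pid_b)
{
	return pid_a != pid_b &&
	       sketch_tid_from_pid(pid_a) == sketch_tid_from_pid(pid_b);
}

struct sketch_waiter {
	uint32_t pid;
	bool condition;  /* has the event this thread waits for actually occurred? */
	bool done;       /* set once the thread leaves its wait loop */
};

/*
 * Deliver a notification by TID. Every waiter with a matching TID is woken,
 * but only those whose own condition holds proceed; the others would simply
 * re-execute wait. Returns how many waiters proceeded.
 */
static int sketch_notify(struct sketch_waiter *w, int n, uint16_t tid)
{
	int proceeded = 0;

	for (int i = 0; i < n; i++) {
		if (sketch_tid_from_pid(w[i].pid) != tid)
			continue;
		if (w[i].condition) {
			w[i].done = true;
			proceeded++;
		}
	}
	return proceeded;
}
```

For example, PIDs 70000 and 4464 share TID 4464 under this truncation, yet a notification only lets the waiter whose condition is true proceed.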
Re: [PATCH v5 3/7] powerpc: use task_pid_nr() for TID allocation
Le 11/05/2018 à 08:12, Alastair D'Silva a écrit : From: Alastair D'Silva The current implementation of TID allocation, using a global IDR, may result in an errant process starving the system of available TIDs. Instead, use task_pid_nr(), as mentioned by the original author. The scenario described which prevented it's use is not applicable, as set_thread_tidr can only be called after the task struct has been populated. In the unlikely event that 2 threads share the TID and are waiting, all potential outcomes have been determined safe. Signed-off-by: Alastair D'Silva --- Thanks for adding the comment. It assumes the reader is aware that the TIDR value is only used for the notification using the 'wait' instruction, but that's likely to be the case. Reviewed-by: Frederic Barrat arch/powerpc/include/asm/switch_to.h | 1 - arch/powerpc/kernel/process.c| 122 ++- 2 files changed, 28 insertions(+), 95 deletions(-) diff --git a/arch/powerpc/include/asm/switch_to.h b/arch/powerpc/include/asm/switch_to.h index be8c9fa23983..5b03d8a82409 100644 --- a/arch/powerpc/include/asm/switch_to.h +++ b/arch/powerpc/include/asm/switch_to.h @@ -94,6 +94,5 @@ static inline void clear_task_ebb(struct task_struct *t) extern int set_thread_uses_vas(void); extern int set_thread_tidr(struct task_struct *t); -extern void clear_thread_tidr(struct task_struct *t); #endif /* _ASM_POWERPC_SWITCH_TO_H */ diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c index 3b00da47699b..c5b8e53acbae 100644 --- a/arch/powerpc/kernel/process.c +++ b/arch/powerpc/kernel/process.c @@ -1496,103 +1496,41 @@ int set_thread_uses_vas(void) } #ifdef CONFIG_PPC64 -static DEFINE_SPINLOCK(vas_thread_id_lock); -static DEFINE_IDA(vas_thread_ida); - -/* - * We need to assign a unique thread id to each thread in a process. +/** + * Assign a TIDR (thread ID) for task @t and set it in the thread + * structure. For now, we only support setting TIDR for 'current' task. 
* - * This thread id, referred to as TIDR, and separate from the Linux's tgid, - * is intended to be used to direct an ASB_Notify from the hardware to the - * thread, when a suitable event occurs in the system. + * Since the TID value is a truncated form of it PID, it is possible + * (but unlikely) for 2 threads to have the same TID. In the unlikely event + * that 2 threads share the same TID and are waiting, one of the following + * cases will happen: * - * One such event is a "paste" instruction in the context of Fast Thread - * Wakeup (aka Core-to-core wake up in the Virtual Accelerator Switchboard - * (VAS) in POWER9. + * 1. The correct thread is running, the wrong thread is not + * In this situation, the correct thread is woken and proceeds to pass it's + * condition check. * - * To get a unique TIDR per process we could simply reuse task_pid_nr() but - * the problem is that task_pid_nr() is not yet available copy_thread() is - * called. Fixing that would require changing more intrusive arch-neutral - * code in code path in copy_process()?. + * 2. Neither threads are running + * In this situation, neither thread will be woken. When scheduled, the waiting + * threads will execute either a wait, which will return immediately, followed + * by a condition check, which will pass for the correct thread and fail + * for the wrong thread, or they will execute the condition check immediately. * - * Further, to assign unique TIDRs within each process, we need an atomic - * field (or an IDR) in task_struct, which again intrudes into the arch- - * neutral code. So try to assign globally unique TIDRs for now. + * 3. The wrong thread is running, the correct thread is not + * The wrong thread will be woken, but will fail it's condition check and + * re-execute wait. 
The correct thread, when scheduled, will execute either + * it's condition check (which will pass), or wait, which returns immediately + * when called the first time after the thread is scheduled, followed by it's + * condition check (which will pass). * - * NOTE: TIDR 0 indicates that the thread does not need a TIDR value. - * For now, only threads that expect to be notified by the VAS - * hardware need a TIDR value and we assign values > 0 for those. - */ -#define MAX_THREAD_CONTEXT ((1 << 16) - 1) -static int assign_thread_tidr(void) -{ - int index; - int err; - unsigned long flags; - -again: - if (!ida_pre_get(_thread_ida, GFP_KERNEL)) - return -ENOMEM; - - spin_lock_irqsave(_thread_id_lock, flags); - err = ida_get_new_above(_thread_ida, 1, ); - spin_unlock_irqrestore(_thread_id_lock, flags); - - if (err == -EAGAIN) - goto again; - else if (err) - return err; - - if (index > MAX_THREAD_CONTEXT) { - spin_lock_irqsave(_thread_id_lock, flags); - ida_rem
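The collision case described in the new comment can be illustrated with a small userspace sketch. It assumes the TID keeps only the low 16 bits of the PID (the removed MAX_THREAD_CONTEXT was (1 << 16) - 1); the helper name tid_from_pid is hypothetical and is not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not from the patch: models a TID that is the
 * PID truncated to 16 bits (an assumption based on the old 16-bit limit).
 */
static uint16_t tid_from_pid(int pid)
{
	assert(pid > 0);
	return (uint16_t)pid;
}
```

Under that assumption, two tasks whose PIDs differ by a multiple of 65536 end up with the same TID, which is why waiters have to re-check their own condition after 'wait' returns.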
Re: [PATCH v5 2/7] powerpc: Use TIDR CPU feature to control TIDR allocation
Le 11/05/2018 à 08:12, Alastair D'Silva a écrit : From: Alastair D'Silva <alast...@d-silva.org> Switch the use of TIDR on it's CPU feature, rather than assuming it is available based on architecture. Signed-off-by: Alastair D'Silva <alast...@d-silva.org> --- Reviewed-by: Frederic Barrat <fbar...@linux.vnet.ibm.com> arch/powerpc/kernel/process.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c index 1237f13fed51..3b00da47699b 100644 --- a/arch/powerpc/kernel/process.c +++ b/arch/powerpc/kernel/process.c @@ -1154,7 +1154,7 @@ static inline void restore_sprs(struct thread_struct *old_thread, mtspr(SPRN_TAR, new_thread->tar); } - if (cpu_has_feature(CPU_FTR_ARCH_300) && + if (cpu_has_feature(CPU_FTR_P9_TIDR) && old_thread->tidr != new_thread->tidr) mtspr(SPRN_TIDR, new_thread->tidr); #endif @@ -1570,7 +1570,7 @@ void clear_thread_tidr(struct task_struct *t) if (!t->thread.tidr) return; - if (!cpu_has_feature(CPU_FTR_ARCH_300)) { + if (!cpu_has_feature(CPU_FTR_P9_TIDR)) { WARN_ON_ONCE(1); return; } @@ -1593,7 +1593,7 @@ int set_thread_tidr(struct task_struct *t) { int rc; - if (!cpu_has_feature(CPU_FTR_ARCH_300)) + if (!cpu_has_feature(CPU_FTR_P9_TIDR)) return -EINVAL; if (t != current)
Re: [PATCH v5 1/7] powerpc: Add TIDR CPU feature for POWER9
Le 11/05/2018 à 08:12, Alastair D'Silva a écrit : From: Alastair D'Silva <alast...@d-silva.org> This patch adds a CPU feature bit to show whether the CPU has the TIDR register available, enabling as_notify/wait in userspace. Signed-off-by: Alastair D'Silva <alast...@d-silva.org> --- Reviewed-by: Frederic Barrat <fbar...@linux.vnet.ibm.com> arch/powerpc/include/asm/cputable.h | 3 ++- arch/powerpc/kernel/dt_cpu_ftrs.c | 1 + 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/include/asm/cputable.h b/arch/powerpc/include/asm/cputable.h index 66fcab13c8b4..9c0a3083571b 100644 --- a/arch/powerpc/include/asm/cputable.h +++ b/arch/powerpc/include/asm/cputable.h @@ -215,6 +215,7 @@ static inline void cpu_feature_keys_init(void) { } #define CPU_FTR_P9_TM_HV_ASSIST LONG_ASM_CONST(0x1000) #define CPU_FTR_P9_TM_XER_SO_BUG LONG_ASM_CONST(0x2000) #define CPU_FTR_P9_TLBIE_BUG LONG_ASM_CONST(0x4000) +#define CPU_FTR_P9_TIDR LONG_ASM_CONST(0x8000) #ifndef __ASSEMBLY__ @@ -462,7 +463,7 @@ static inline void cpu_feature_keys_init(void) { } CPU_FTR_CFAR | CPU_FTR_HVMODE | CPU_FTR_VMX_COPY | \ CPU_FTR_DBELL | CPU_FTR_HAS_PPR | CPU_FTR_ARCH_207S | \ CPU_FTR_TM_COMP | CPU_FTR_ARCH_300 | CPU_FTR_PKEY | \ - CPU_FTR_P9_TLBIE_BUG) + CPU_FTR_P9_TLBIE_BUG | CPU_FTR_P9_TIDR) #define CPU_FTRS_POWER9_DD1 ((CPU_FTRS_POWER9 | CPU_FTR_POWER9_DD1) & \ (~CPU_FTR_SAO)) #define CPU_FTRS_POWER9_DD2_0 CPU_FTRS_POWER9 diff --git a/arch/powerpc/kernel/dt_cpu_ftrs.c b/arch/powerpc/kernel/dt_cpu_ftrs.c index 8ab51f6ca03a..41e5b69f 100644 --- a/arch/powerpc/kernel/dt_cpu_ftrs.c +++ b/arch/powerpc/kernel/dt_cpu_ftrs.c @@ -716,6 +716,7 @@ static __init void cpufeatures_cpu_quirks(void) if ((version & 0x) == 0x004e) { cur_cpu_spec->cpu_features &= ~(CPU_FTR_DAWR); cur_cpu_spec->cpu_features |= CPU_FTR_P9_TLBIE_BUG; + cur_cpu_spec->cpu_features |= CPU_FTR_P9_TIDR; } /*
Re: [PATCH v2 7/7] ocxl: Document new OCXL IOCTLs
Le 18/04/2018 à 03:08, Alastair D'Silva a écrit : From: Alastair D'Silva <alast...@d-silva.org> Signed-off-by: Alastair D'Silva <alast...@d-silva.org> --- Acked-by: Frederic Barrat <fbar...@linux.vnet.ibm.com> Fred Documentation/accelerators/ocxl.rst | 11 +++ 1 file changed, 11 insertions(+) diff --git a/Documentation/accelerators/ocxl.rst b/Documentation/accelerators/ocxl.rst index 7904adcc07fd..3b8d3b99795c 100644 --- a/Documentation/accelerators/ocxl.rst +++ b/Documentation/accelerators/ocxl.rst @@ -157,6 +157,17 @@ OCXL_IOCTL_GET_METADATA: Obtains configuration information from the card, such at the size of MMIO areas, the AFU version, and the PASID for the current context. +OCXL_IOCTL_ENABLE_P9_WAIT: + + Allows the AFU to wake a userspace thread executing 'wait'. Returns + information to userspace to allow it to configure the AFU. Note that + this is only available on Power 9. + +OCXL_IOCTL_GET_FEATURES: + + Reports on which CPU features that affect OpenCAPI are usable from + userspace. + mmap
Re: [PATCH v2 6/7] ocxl: Add an IOCTL so userspace knows what CPU features are available
Le 18/04/2018 à 03:08, Alastair D'Silva a écrit : From: Alastair D'Silva <alast...@d-silva.org> In order for a userspace AFU driver to call the Power9 specific OCXL_IOCTL_ENABLE_P9_WAIT, it needs to verify that it can actually make that call. Signed-off-by: Alastair D'Silva <alast...@d-silva.org> --- Documentation/accelerators/ocxl.rst | 1 - drivers/misc/ocxl/file.c| 25 + include/uapi/misc/ocxl.h| 4 3 files changed, 29 insertions(+), 1 deletion(-) diff --git a/Documentation/accelerators/ocxl.rst b/Documentation/accelerators/ocxl.rst index ddcc58d01cfb..7904adcc07fd 100644 --- a/Documentation/accelerators/ocxl.rst +++ b/Documentation/accelerators/ocxl.rst @@ -157,7 +157,6 @@ OCXL_IOCTL_GET_METADATA: Obtains configuration information from the card, such at the size of MMIO areas, the AFU version, and the PASID for the current context. - Intended? Other than that, Acked-by: Frederic Barrat <fbar...@linux.vnet.ibm.com> mmap diff --git a/drivers/misc/ocxl/file.c b/drivers/misc/ocxl/file.c index eb409a469f21..33ae46ce0a8a 100644 --- a/drivers/misc/ocxl/file.c +++ b/drivers/misc/ocxl/file.c @@ -168,12 +168,32 @@ static long afu_ioctl_enable_p9_wait(struct ocxl_context *ctx, } #endif + +static long afu_ioctl_get_features(struct ocxl_context *ctx, + struct ocxl_ioctl_features __user *uarg) +{ + struct ocxl_ioctl_features arg; + + memset(, 0, sizeof(arg)); + +#ifdef CONFIG_PPC64 + if (cpu_has_feature(CPU_FTR_P9_TIDR)) + arg.flags[0] |= OCXL_IOCTL_FEATURES_FLAGS0_P9_WAIT; +#endif + + if (copy_to_user(uarg, , sizeof(arg))) + return -EFAULT; + + return 0; +} + #define CMD_STR(x) (x == OCXL_IOCTL_ATTACH ? "ATTACH" : \ x == OCXL_IOCTL_IRQ_ALLOC ? "IRQ_ALLOC" : \ x == OCXL_IOCTL_IRQ_FREE ? "IRQ_FREE" : \ x == OCXL_IOCTL_IRQ_SET_FD ? "IRQ_SET_FD" : \ x == OCXL_IOCTL_GET_METADATA ? "GET_METADATA" : \ x == OCXL_IOCTL_ENABLE_P9_WAIT ? "ENABLE_P9_WAIT" : \ + x == OCXL_IOCTL_GET_FEATURES ? 
"GET_FEATURES" : \ "UNKNOWN") static long afu_ioctl(struct file *file, unsigned int cmd, @@ -239,6 +259,11 @@ static long afu_ioctl(struct file *file, unsigned int cmd, break; #endif + case OCXL_IOCTL_GET_FEATURES: + rc = afu_ioctl_get_features(ctx, + (struct ocxl_ioctl_features __user *) args); + break; + default: rc = -EINVAL; } diff --git a/include/uapi/misc/ocxl.h b/include/uapi/misc/ocxl.h index 8d2748e69c84..bb80f294b429 100644 --- a/include/uapi/misc/ocxl.h +++ b/include/uapi/misc/ocxl.h @@ -55,6 +55,9 @@ struct ocxl_ioctl_p9_wait { __u64 reserved3[3]; }; +#define OCXL_IOCTL_FEATURES_FLAGS0_P9_WAIT 0x01 +struct ocxl_ioctl_features { + __u64 flags[4]; }; struct ocxl_ioctl_irq_fd { @@ -72,5 +75,6 @@ struct ocxl_ioctl_irq_fd { #define OCXL_IOCTL_IRQ_SET_FD _IOW(OCXL_MAGIC, 0x13, struct ocxl_ioctl_irq_fd) #define OCXL_IOCTL_GET_METADATA _IOR(OCXL_MAGIC, 0x14, struct ocxl_ioctl_metadata) #define OCXL_IOCTL_ENABLE_P9_WAIT _IOR(OCXL_MAGIC, 0x15, struct ocxl_ioctl_p9_wait) +#define OCXL_IOCTL_GET_FEATURES _IOR(OCXL_MAGIC, 0x16, struct ocxl_ioctl_platform) #endif /* _UAPI_MISC_OCXL_H */
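On the userspace side, the check this ioctl enables might look like the sketch below. The struct and the flag value mirror the uapi definitions quoted in the patch, but the local names (sketch_features, SKETCH_FLAGS0_P9_WAIT, p9_wait_supported) are assumptions made for illustration; a real program would include the uapi header and fill the struct with ioctl(fd, OCXL_IOCTL_GET_FEATURES, &f) on an open AFU descriptor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of OCXL_IOCTL_FEATURES_FLAGS0_P9_WAIT from the patch. */
#define SKETCH_FLAGS0_P9_WAIT 0x01

/* Local mirror of struct ocxl_ioctl_features (four 64-bit flag words). */
struct sketch_features {
	uint64_t flags[4];
};

/* Hypothetical helper: true if the kernel reported that 'wait' is usable. */
static bool p9_wait_supported(const struct sketch_features *f)
{
	assert(f != NULL);
	return (f->flags[0] & SKETCH_FLAGS0_P9_WAIT) != 0;
}
```

Only when this returns true would the driver go on to call OCXL_IOCTL_ENABLE_P9_WAIT.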
Re: [PATCH v2 5/7] ocxl: Expose the thread_id needed for wait on p9
Le 18/04/2018 à 03:08, Alastair D'Silva a écrit : From: Alastair D'Silva In order to successfully issue as_notify, an AFU needs to know the TID to notify, which in turn means that this information should be available in userspace so it can be communicated to the AFU. Signed-off-by: Alastair D'Silva --- drivers/misc/ocxl/context.c | 5 +++- drivers/misc/ocxl/file.c | 53 +++ drivers/misc/ocxl/link.c | 36 ++ drivers/misc/ocxl/ocxl_internal.h | 1 + include/misc/ocxl.h | 9 +++ include/uapi/misc/ocxl.h | 10 6 files changed, 113 insertions(+), 1 deletion(-) diff --git a/drivers/misc/ocxl/context.c b/drivers/misc/ocxl/context.c index 909e8807824a..95f74623113e 100644 --- a/drivers/misc/ocxl/context.c +++ b/drivers/misc/ocxl/context.c @@ -34,6 +34,8 @@ int ocxl_context_init(struct ocxl_context *ctx, struct ocxl_afu *afu, mutex_init(>xsl_error_lock); mutex_init(>irq_lock); idr_init(>irq_idr); + ctx->tidr = 0; + /* * Keep a reference on the AFU to make sure it's valid for the * duration of the life of the context @@ -65,6 +67,7 @@ int ocxl_context_attach(struct ocxl_context *ctx, u64 amr) { int rc; + // Locks both status & tidr mutex_lock(>status_mutex); if (ctx->status != OPENED) { rc = -EIO; @@ -72,7 +75,7 @@ int ocxl_context_attach(struct ocxl_context *ctx, u64 amr) } rc = ocxl_link_add_pe(ctx->afu->fn->link, ctx->pasid, - current->mm->context.id, 0, amr, current->mm, + current->mm->context.id, ctx->tidr, amr, current->mm, xsl_fault_error, ctx); if (rc) goto out; diff --git a/drivers/misc/ocxl/file.c b/drivers/misc/ocxl/file.c index 038509e5d031..eb409a469f21 100644 --- a/drivers/misc/ocxl/file.c +++ b/drivers/misc/ocxl/file.c @@ -5,6 +5,8 @@ #include #include #include +#include +#include #include "ocxl_internal.h" @@ -123,11 +125,55 @@ static long afu_ioctl_get_metadata(struct ocxl_context *ctx, return 0; } +#ifdef CONFIG_PPC64 +static long afu_ioctl_enable_p9_wait(struct ocxl_context *ctx, + struct ocxl_ioctl_p9_wait __user *uarg) +{ + struct ocxl_ioctl_p9_wait arg; + + 
memset(, 0, sizeof(arg)); + + if (cpu_has_feature(CPU_FTR_P9_TIDR)) { + enum ocxl_context_status status; + + // Locks both status & tidr + mutex_lock(>status_mutex); + if (!ctx->tidr) { + if (set_thread_tidr(current)) + return -ENOENT; + + ctx->tidr = current->thread.tidr; + } Now that we don't have the TIDR limit problem, I'm wondering if we cannot relax our rule a bit and have: - first thread to enable will become the default thread and update the Process element - any subsequent enable would just allocate the TIDR for the calling thread. That way, more than one thread could be used for 'wait'. Thoughts? Fred + + status = ctx->status; + mutex_unlock(>status_mutex); + + if (status == ATTACHED) { + int rc; + struct link *link = ctx->afu->fn->link; + + rc = ocxl_link_update_pe(link, ctx->pasid, ctx->tidr); + if (rc) + return rc; + } + + arg.thread_id = ctx->tidr; + } else + return -ENOENT; + + if (copy_to_user(uarg, , sizeof(arg))) + return -EFAULT; + + return 0; +} +#endif + #define CMD_STR(x) (x == OCXL_IOCTL_ATTACH ? "ATTACH" : \ x == OCXL_IOCTL_IRQ_ALLOC ? "IRQ_ALLOC" : \ x == OCXL_IOCTL_IRQ_FREE ? "IRQ_FREE" : \ x == OCXL_IOCTL_IRQ_SET_FD ? "IRQ_SET_FD" : \ x == OCXL_IOCTL_GET_METADATA ? "GET_METADATA" : \ + x == OCXL_IOCTL_ENABLE_P9_WAIT ? "ENABLE_P9_WAIT" : \ "UNKNOWN") static long afu_ioctl(struct file *file, unsigned int cmd, @@ -186,6 +232,13 @@ static long afu_ioctl(struct file *file, unsigned int cmd, (struct ocxl_ioctl_metadata __user *) args); break; +#ifdef CONFIG_PPC64 + case OCXL_IOCTL_ENABLE_P9_WAIT: + rc = afu_ioctl_enable_p9_wait(ctx, + (struct ocxl_ioctl_p9_wait __user *) args); + break; +#endif + default: rc = -EINVAL; } diff --git a/drivers/misc/ocxl/link.c b/drivers/misc/ocxl/link.c index 656e8610eec2..88876ae8f330 100644 --- a/drivers/misc/ocxl/link.c +++ b/drivers/misc/ocxl/link.c @@ -544,6
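The relaxed rule Fred suggests in the review comment above could be modelled roughly as follows. This is only a userspace sketch of the idea, not driver code: sketch_ctx, sketch_enable_wait and the pe_updates counter are hypothetical, and the TIDR passed in is assumed to have already been allocated for the calling thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the ocxl context; only what the sketch needs. */
struct sketch_ctx {
	uint16_t default_tidr;	/* 0 means no default thread yet */
	int pe_updates;		/* counts simulated Process Element updates */
};

/*
 * Hypothetical model of the relaxed rule. thread_tidr is the non-zero
 * TIDR of the calling thread. Returns the TIDR to hand to the AFU.
 */
static uint16_t sketch_enable_wait(struct sketch_ctx *ctx, uint16_t thread_tidr)
{
	assert(ctx != NULL && thread_tidr != 0);

	if (ctx->default_tidr == 0) {
		/* First caller becomes the default thread. */
		ctx->default_tidr = thread_tidr;
		ctx->pe_updates++;	/* stands in for updating the Process Element */
	}

	/* Every caller, first or not, gets its own TIDR back. */
	return thread_tidr;
}
```

With this shape, only the first enable touches the Process Element, while later callers still obtain a TIDR they can pass to the AFU, so more than one thread could use 'wait'.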
Re: [PATCH v2 4/7] ocxl: Rename pnv_ocxl_spa_remove_pe to clarify it's action
Le 18/04/2018 à 03:08, Alastair D'Silva a écrit : From: Alastair D'Silva <alast...@d-silva.org> The function removes the process element from NPU cache. Signed-off-by: Alastair D'Silva <alast...@d-silva.org> --- Acked-by: Frederic Barrat <fbar...@linux.vnet.ibm.com> arch/powerpc/include/asm/pnv-ocxl.h | 2 +- arch/powerpc/platforms/powernv/ocxl.c | 4 ++-- drivers/misc/ocxl/link.c | 2 +- 3 files changed, 4 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/include/asm/pnv-ocxl.h b/arch/powerpc/include/asm/pnv-ocxl.h index f6945d3bc971..208b5503f4ed 100644 --- a/arch/powerpc/include/asm/pnv-ocxl.h +++ b/arch/powerpc/include/asm/pnv-ocxl.h @@ -28,7 +28,7 @@ extern int pnv_ocxl_map_xsl_regs(struct pci_dev *dev, void __iomem **dsisr, extern int pnv_ocxl_spa_setup(struct pci_dev *dev, void *spa_mem, int PE_mask, void **platform_data); extern void pnv_ocxl_spa_release(void *platform_data); -extern int pnv_ocxl_spa_remove_pe(void *platform_data, int pe_handle); +extern int pnv_ocxl_spa_remove_pe_from_cache(void *platform_data, int pe_handle); extern int pnv_ocxl_alloc_xive_irq(u32 *irq, u64 *trigger_addr); extern void pnv_ocxl_free_xive_irq(u32 irq); diff --git a/arch/powerpc/platforms/powernv/ocxl.c b/arch/powerpc/platforms/powernv/ocxl.c index fa9b53af3c7b..8c65aacda9c8 100644 --- a/arch/powerpc/platforms/powernv/ocxl.c +++ b/arch/powerpc/platforms/powernv/ocxl.c @@ -475,7 +475,7 @@ void pnv_ocxl_spa_release(void *platform_data) } EXPORT_SYMBOL_GPL(pnv_ocxl_spa_release); -int pnv_ocxl_spa_remove_pe(void *platform_data, int pe_handle) +int pnv_ocxl_spa_remove_pe_from_cache(void *platform_data, int pe_handle) { struct spa_data *data = (struct spa_data *) platform_data; int rc; @@ -483,7 +483,7 @@ int pnv_ocxl_spa_remove_pe(void *platform_data, int pe_handle) rc = opal_npu_spa_clear_cache(data->phb_opal_id, data->bdfn, pe_handle); return rc; } -EXPORT_SYMBOL_GPL(pnv_ocxl_spa_remove_pe); +EXPORT_SYMBOL_GPL(pnv_ocxl_spa_remove_pe_from_cache); int 
pnv_ocxl_alloc_xive_irq(u32 *irq, u64 *trigger_addr) { diff --git a/drivers/misc/ocxl/link.c b/drivers/misc/ocxl/link.c index f30790582dc0..656e8610eec2 100644 --- a/drivers/misc/ocxl/link.c +++ b/drivers/misc/ocxl/link.c @@ -599,7 +599,7 @@ int ocxl_link_remove_pe(void *link_handle, int pasid) * On powerpc, the entry needs to be cleared from the context * cache of the NPU. */ - rc = pnv_ocxl_spa_remove_pe(link->platform_data, pe_handle); + rc = pnv_ocxl_spa_remove_pe_from_cache(link->platform_data, pe_handle); WARN_ON(rc); pe_data = radix_tree_delete(>pe_tree, pe_handle);
Re: [PATCH v2 3/7] powerpc: use task_pid_nr() for TID allocation
On 18/04/2018 at 03:08, Alastair D'Silva wrote:
From: Alastair D'Silva <alast...@d-silva.org>

The current implementation of TID allocation, using a global IDR, may result in an errant process starving the system of available TIDs. Instead, use task_pid_nr(), as mentioned by the original author. The scenario described which prevented its use is not applicable, as set_thread_tidr can only be called after the task struct has been populated.

Here is how I understand what's going to happen if 2 threads are using the same TIDR value, which is possible with this patch (if unlikely):

1. waking up the wrong thread is not really a problem, as threads have to handle spurious wake-ups from the 'wait' instruction anyway, and must be using some other condition to know when to loop around the 'wait' instruction.

2. missing the right thread: if the wrong thread is on a CPU, and a wake_host_thread/as_notify is sent, the core will see a matching thread and will accept the command. The (open)capi adapter won't send an interrupt. The wrong thread is woken, which is not a problem as discussed above. As the right thread to notify is not running, no harm is done either: as soon as the thread runs, it's supposed to check its condition (which will be met) or call 'wait', but 'wait' immediately returns when called the first time after a thread is scheduled.

So I believe we are ok.
But I think it requires a huge comment with the above (at the minimum) :-) With a comment: Reviewed-by: Frederic Barrat <fbar...@linux.vnet.ibm.com> Fred Signed-off-by: Alastair D'Silva <alast...@d-silva.org> --- arch/powerpc/include/asm/switch_to.h | 1 - arch/powerpc/kernel/process.c| 97 +--- 2 files changed, 1 insertion(+), 97 deletions(-) diff --git a/arch/powerpc/include/asm/switch_to.h b/arch/powerpc/include/asm/switch_to.h index be8c9fa23983..5b03d8a82409 100644 --- a/arch/powerpc/include/asm/switch_to.h +++ b/arch/powerpc/include/asm/switch_to.h @@ -94,6 +94,5 @@ static inline void clear_task_ebb(struct task_struct *t) extern int set_thread_uses_vas(void); extern int set_thread_tidr(struct task_struct *t); -extern void clear_thread_tidr(struct task_struct *t); #endif /* _ASM_POWERPC_SWITCH_TO_H */ diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c index 3b00da47699b..87f047fd2762 100644 --- a/arch/powerpc/kernel/process.c +++ b/arch/powerpc/kernel/process.c @@ -1496,103 +1496,12 @@ int set_thread_uses_vas(void) } #ifdef CONFIG_PPC64 -static DEFINE_SPINLOCK(vas_thread_id_lock); -static DEFINE_IDA(vas_thread_ida); - -/* - * We need to assign a unique thread id to each thread in a process. - * - * This thread id, referred to as TIDR, and separate from the Linux's tgid, - * is intended to be used to direct an ASB_Notify from the hardware to the - * thread, when a suitable event occurs in the system. - * - * One such event is a "paste" instruction in the context of Fast Thread - * Wakeup (aka Core-to-core wake up in the Virtual Accelerator Switchboard - * (VAS) in POWER9. - * - * To get a unique TIDR per process we could simply reuse task_pid_nr() but - * the problem is that task_pid_nr() is not yet available copy_thread() is - * called. Fixing that would require changing more intrusive arch-neutral - * code in code path in copy_process()?. 
- * - * Further, to assign unique TIDRs within each process, we need an atomic - * field (or an IDR) in task_struct, which again intrudes into the arch- - * neutral code. So try to assign globally unique TIDRs for now. - * - * NOTE: TIDR 0 indicates that the thread does not need a TIDR value. - * For now, only threads that expect to be notified by the VAS - * hardware need a TIDR value and we assign values > 0 for those. - */ -#define MAX_THREAD_CONTEXT ((1 << 16) - 1) -static int assign_thread_tidr(void) -{ - int index; - int err; - unsigned long flags; - -again: - if (!ida_pre_get(_thread_ida, GFP_KERNEL)) - return -ENOMEM; - - spin_lock_irqsave(_thread_id_lock, flags); - err = ida_get_new_above(_thread_ida, 1, ); - spin_unlock_irqrestore(_thread_id_lock, flags); - - if (err == -EAGAIN) - goto again; - else if (err) - return err; - - if (index > MAX_THREAD_CONTEXT) { - spin_lock_irqsave(_thread_id_lock, flags); - ida_remove(_thread_ida, index); - spin_unlock_irqrestore(_thread_id_lock, flags); - return -ENOMEM; - } - - return index; -} - -static void free_thread_tidr(int id) -{ - unsigned long flags; - - spin_lock_irqsave(_thread_id_lock, flags); - ida_remove(_thread_ida, id); - spin_unlock_irqrestore(_thread_id_lock, flags); -} - -/* - * Clear any TIDR value assigned to this thread. - */ -void clear_thread_tidr(struct task_struct *t) -{ - if
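Both cases in Fred's review depend on the woken thread re-checking its own condition rather than trusting the wake-up. A minimal sketch of that loop in portable C, assuming a stub in place of the real 'wait' instruction; `work_ready`, `spurious_left`, `wait_stub()` and `wait_for_work()` are hypothetical names used only for illustration:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Condition the notifier sets before sending the wake-up (hypothetical). */
static atomic_bool work_ready;

/* Number of spurious returns the stub simulates before the real event. */
static unsigned int spurious_left;

/*
 * Hypothetical stand-in for the 'wait' instruction: it may return without
 * the condition being met, e.g. when another thread sharing the same TIDR
 * was the intended target.
 */
static void wait_stub(void)
{
	if (spurious_left > 0)
		spurious_left--;			/* spurious return */
	else
		atomic_store(&work_ready, true);	/* the real notification */
}

/* Loop around 'wait' until the condition holds; returns the wake-up count. */
static unsigned int wait_for_work(void)
{
	unsigned int wakeups = 0;

	while (!atomic_load(&work_ready)) {
		wait_stub();
		wakeups++;
	}
	return wakeups;
}
```

Because the condition is tested before every call to the stub, a thread whose notification arrived while it was not running never blocks: it sees the condition already set and skips the wait entirely.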
Re: [PATCH v2 2/7] powerpc: Use TIDR CPU feature to control TIDR allocation
On 18/04/2018 03:08, Alastair D'Silva wrote:
> From: Alastair D'Silva <alast...@d-silva.org>
>
> Switch the use of TIDR on its CPU feature, rather than assuming it is
> available based on architecture.
>
> Signed-off-by: Alastair D'Silva <alast...@d-silva.org>
> ---

Reviewed-by: Frederic Barrat <fbar...@linux.vnet.ibm.com>

>  arch/powerpc/kernel/process.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
> index 1237f13fed51..3b00da47699b 100644
> --- a/arch/powerpc/kernel/process.c
> +++ b/arch/powerpc/kernel/process.c
> @@ -1154,7 +1154,7 @@ static inline void restore_sprs(struct thread_struct *old_thread,
>  			mtspr(SPRN_TAR, new_thread->tar);
>  	}
>
> -	if (cpu_has_feature(CPU_FTR_ARCH_300) &&
> +	if (cpu_has_feature(CPU_FTR_P9_TIDR) &&
>  	    old_thread->tidr != new_thread->tidr)
>  		mtspr(SPRN_TIDR, new_thread->tidr);
>  #endif
> @@ -1570,7 +1570,7 @@ void clear_thread_tidr(struct task_struct *t)
>  	if (!t->thread.tidr)
>  		return;
>
> -	if (!cpu_has_feature(CPU_FTR_ARCH_300)) {
> +	if (!cpu_has_feature(CPU_FTR_P9_TIDR)) {
>  		WARN_ON_ONCE(1);
>  		return;
>  	}
> @@ -1593,7 +1593,7 @@ int set_thread_tidr(struct task_struct *t)
>  {
>  	int rc;
>
> -	if (!cpu_has_feature(CPU_FTR_ARCH_300))
> +	if (!cpu_has_feature(CPU_FTR_P9_TIDR))
>  		return -EINVAL;
>
>  	if (t != current)
Re: [PATCH v2 1/7] powerpc: Add TIDR CPU feature for Power9
On 18/04/2018 03:08, Alastair D'Silva wrote:
> From: Alastair D'Silva
>
> This patch adds a CPU feature bit to show whether the CPU has
> the TIDR register available, enabling as_notify/wait in userspace.
>
> Signed-off-by: Alastair D'Silva
> ---
>  arch/powerpc/include/asm/cputable.h | 3 ++-
>  arch/powerpc/kernel/dt_cpu_ftrs.c   | 1 +
>  2 files changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/include/asm/cputable.h b/arch/powerpc/include/asm/cputable.h
> index 4e332f3531c5..54c4cbbe57b4 100644
> --- a/arch/powerpc/include/asm/cputable.h
> +++ b/arch/powerpc/include/asm/cputable.h
> @@ -215,6 +215,7 @@ static inline void cpu_feature_keys_init(void) { }
>  #define CPU_FTR_P9_TM_HV_ASSIST		LONG_ASM_CONST(0x1000)
>  #define CPU_FTR_P9_TM_XER_SO_BUG	LONG_ASM_CONST(0x2000)
>  #define CPU_FTR_P9_TLBIE_BUG		LONG_ASM_CONST(0x4000)
> +#define CPU_FTR_P9_TIDR			LONG_ASM_CONST(0x8000)
>
>  #ifndef __ASSEMBLY__
>
> @@ -462,7 +463,7 @@ static inline void cpu_feature_keys_init(void) { }
>  	    CPU_FTR_CFAR | CPU_FTR_HVMODE | CPU_FTR_VMX_COPY | \
>  	    CPU_FTR_DBELL | CPU_FTR_HAS_PPR | CPU_FTR_ARCH_207S | \
>  	    CPU_FTR_TM_COMP | CPU_FTR_ARCH_300 | CPU_FTR_PKEY | \
> -	    CPU_FTR_P9_TLBIE_BUG)
> +	    CPU_FTR_P9_TLBIE_BUG | CPU_FTR_P9_TIDR)
>  #define CPU_FTRS_POWER9_DD1 ((CPU_FTRS_POWER9 | CPU_FTR_POWER9_DD1) & \
>  			     (~CPU_FTR_SAO))
>  #define CPU_FTRS_POWER9_DD2_0 CPU_FTRS_POWER9
> diff --git a/arch/powerpc/kernel/dt_cpu_ftrs.c b/arch/powerpc/kernel/dt_cpu_ftrs.c
> index 11a3a4fed3fb..10f8b7f55637 100644
> --- a/arch/powerpc/kernel/dt_cpu_ftrs.c
> +++ b/arch/powerpc/kernel/dt_cpu_ftrs.c
> @@ -722,6 +722,7 @@ static __init void cpufeatures_cpu_quirks(void)
>  	if ((version & 0x) == 0x004e) {
>  		cur_cpu_spec->cpu_features &= ~(CPU_FTR_DAWR);
>  		cur_cpu_spec->cpu_features |= CPU_FTR_P9_TLBIE_BUG;
> +		cur_cpu_spec->cpu_features |= CPU_FTR_P9_TIDR;

Isn't it redundant with adding the flag to CPU_FTRS_POWER9?

  Fred

>  	}
>  }
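Whether the extra OR in the quirk is redundant depends on where the feature word starts from. A small sketch of that point, assuming made-up bit positions; `FTR_TLBIE_BUG`, `FTR_TIDR`, `FTRS_BASE` and `apply_quirks()` are hypothetical stand-ins, not the kernel's definitions:

```c
#include <stdint.h>

/* Hypothetical feature bits; the real values live in cputable.h. */
#define FTR_TLBIE_BUG	(UINT64_C(1) << 0)
#define FTR_TIDR	(UINT64_C(1) << 1)

/* Hypothetical static mask that already contains the TIDR bit. */
#define FTRS_BASE	(FTR_TLBIE_BUG | FTR_TIDR)

/* Mirrors the shape of the quirk: OR both bits into the feature word. */
static uint64_t apply_quirks(uint64_t features)
{
	features |= FTR_TLBIE_BUG;
	features |= FTR_TIDR;	/* no-op only if the bit is already set */
	return features;
}
```

Starting from `FTRS_BASE` the quirk changes nothing, which is the redundancy the review asks about. Starting from a feature word that was not built from the static mask, the quirk is what sets the bit.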
Re: [PATCH 2/2] misc: ocxl: use put_device() instead of device_unregister()
On 12/03/2018 12:36, Arvind Yadav wrote:
> If device_register() returned an error, always use put_device()
> to give up the initialized reference.
>
> Signed-off-by: Arvind Yadav <arvind.yadav...@gmail.com>
> ---

OK, device_unregister() calls put_device() but also other actions that we can skip in this case.

Acked-by: Frederic Barrat <fbar...@linux.vnet.ibm.com>

>  drivers/misc/ocxl/pci.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/misc/ocxl/pci.c b/drivers/misc/ocxl/pci.c
> index 0051d9e..21f4254 100644
> --- a/drivers/misc/ocxl/pci.c
> +++ b/drivers/misc/ocxl/pci.c
> @@ -519,7 +519,7 @@ static struct ocxl_fn *init_function(struct pci_dev *dev)
>  	rc = device_register(&fn->dev);
>  	if (rc) {
>  		deconfigure_function(fn);
> -		device_unregister(&fn->dev);
> +		put_device(&fn->dev);
>  		return ERR_PTR(rc);
>  	}
>  	return fn;
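In the driver core, device_register() is device_initialize() followed by device_add(), and device_unregister() is device_del() followed by put_device(). The following userspace mock illustrates why only the put is wanted after a failed registration; every `mock_*` name is hypothetical and the model is deliberately simplified to a bare reference count:

```c
#include <stdbool.h>

/* Simplified userspace model of a refcounted device (hypothetical). */
struct mock_dev {
	int refcount;
	bool added;	/* true once the "add" step has succeeded */
	bool released;	/* true once the last reference is dropped */
	int bogus_del;	/* times "del" ran on a device that was never added */
};

static void mock_initialize(struct mock_dev *d)
{
	d->refcount = 1;
	d->added = false;
	d->released = false;
	d->bogus_del = 0;
}

static void mock_put(struct mock_dev *d)
{
	if (--d->refcount == 0)
		d->released = true;
}

static void mock_del(struct mock_dev *d)
{
	if (!d->added)
		d->bogus_del++;	/* undoing an add that never happened */
	d->added = false;
}

/* Models unregister as "del, then put". */
static void mock_unregister(struct mock_dev *d)
{
	mock_del(d);
	mock_put(d);
}

/* Models a register call whose add step fails after initialization. */
static int mock_register_failing(struct mock_dev *d)
{
	mock_initialize(d);
	return -1;
}
```

In this model both cleanup paths release the object, but only the unregister path also tries to undo an add that never took place, which is the extra work the review says can be skipped.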
Re: [PATCH v3 2/2] ocxl: Document the OCXL_IOCTL_GET_METADATA IOCTL
On 22/02/2018 05:17, Alastair D'Silva wrote:
> From: Alastair D'Silva <alast...@d-silva.org>
>
> Signed-off-by: Alastair D'Silva <alast...@d-silva.org>
> ---

Acked-by: Frederic Barrat <fbar...@linux.vnet.ibm.com>

>  Documentation/accelerators/ocxl.rst | 5 +
>  1 file changed, 5 insertions(+)
>
> diff --git a/Documentation/accelerators/ocxl.rst b/Documentation/accelerators/ocxl.rst
> index 4f7af841d935..ddcc58d01cfb 100644
> --- a/Documentation/accelerators/ocxl.rst
> +++ b/Documentation/accelerators/ocxl.rst
> @@ -152,6 +152,11 @@ OCXL_IOCTL_IRQ_SET_FD:
>    Associate an event fd to an AFU interrupt so that the user process
>    can be notified when the AFU sends an interrupt.
>
> +OCXL_IOCTL_GET_METADATA:
> +
> +  Obtains configuration information from the card, such as the size of
> +  MMIO areas, the AFU version, and the PASID for the current context.
> +
>
>  mmap
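The information this ioctl returns is carried in struct ocxl_ioctl_metadata, introduced by patch 1/2 of the same series. As a rough check of that layout, the sketch below mirrors the version 0 fields with fixed-width types; `struct metadata_v0` and `metadata_v0_used_bytes()` are hypothetical local names, not part of the uapi header:

```c
#include <stddef.h>
#include <stdint.h>

/* Local mirror of the version 0 layout; not the real uapi definition. */
struct metadata_v0 {
	uint16_t version;		/* struct version */
	uint8_t  afu_version_major;
	uint8_t  afu_version_minor;
	uint32_t pasid;			/* PASID of the current context */
	uint64_t pp_mmio_size;		/* per-PASID MMIO size */
	uint64_t global_mmio_size;
	uint64_t reserved[13];		/* room for later versions */
};

/* The patch comment says the struct totals 16 * u64. */
_Static_assert(sizeof(struct metadata_v0) == 16 * sizeof(uint64_t),
	       "metadata struct should span 16 64-bit words");

/* Bytes consumed by the version 0 fields, i.e. where 'reserved' starts. */
static size_t metadata_v0_used_bytes(void)
{
	return offsetof(struct metadata_v0, reserved);
}
```

The first four fields pack into one 64-bit word, so three words are in use and thirteen remain reserved, which matches the "Total of 16*u64" comment in the patch.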
Re: [PATCH v3 1/2] ocxl: Add get_metadata IOCTL to share OCXL information to userspace
On 22/02/2018 05:17, Alastair D'Silva wrote:
> From: Alastair D'Silva <alast...@d-silva.org>
>
> Some required information is not exposed to userspace currently (e.g. the
> PASID). Pass this information back, along with other information which
> is currently communicated via sysfs, which saves some parsing effort in
> userspace.
>
> Signed-off-by: Alastair D'Silva <alast...@d-silva.org>
> ---

Thanks!
Acked-by: Frederic Barrat <fbar...@linux.vnet.ibm.com>

>  drivers/misc/ocxl/file.c | 27 +++
>  include/uapi/misc/ocxl.h | 17 +
>  2 files changed, 44 insertions(+)
>
> diff --git a/drivers/misc/ocxl/file.c b/drivers/misc/ocxl/file.c
> index d9aa407db06a..90df1be5ef3f 100644
> --- a/drivers/misc/ocxl/file.c
> +++ b/drivers/misc/ocxl/file.c
> @@ -102,10 +102,32 @@ static long afu_ioctl_attach(struct ocxl_context *ctx,
>  	return rc;
>  }
>
> +static long afu_ioctl_get_metadata(struct ocxl_context *ctx,
> +		struct ocxl_ioctl_metadata __user *uarg)
> +{
> +	struct ocxl_ioctl_metadata arg;
> +
> +	memset(&arg, 0, sizeof(arg));
> +
> +	arg.version = 0;
> +
> +	arg.afu_version_major = ctx->afu->config.version_major;
> +	arg.afu_version_minor = ctx->afu->config.version_minor;
> +	arg.pasid = ctx->pasid;
> +	arg.pp_mmio_size = ctx->afu->config.pp_mmio_stride;
> +	arg.global_mmio_size = ctx->afu->config.global_mmio_size;
> +
> +	if (copy_to_user(uarg, &arg, sizeof(arg)))
> +		return -EFAULT;
> +
> +	return 0;
> +}
> +
>  #define CMD_STR(x) (x == OCXL_IOCTL_ATTACH ? "ATTACH" :		\
>  			x == OCXL_IOCTL_IRQ_ALLOC ? "IRQ_ALLOC" :	\
>  			x == OCXL_IOCTL_IRQ_FREE ? "IRQ_FREE" :		\
>  			x == OCXL_IOCTL_IRQ_SET_FD ? "IRQ_SET_FD" :	\
> +			x == OCXL_IOCTL_GET_METADATA ? "GET_METADATA" :	\
>  			"UNKNOWN")
>
>  static long afu_ioctl(struct file *file, unsigned int cmd,
> @@ -157,6 +179,11 @@ static long afu_ioctl(struct file *file, unsigned int cmd,
>  					irq_fd.eventfd);
>  		break;
>
> +	case OCXL_IOCTL_GET_METADATA:
> +		rc = afu_ioctl_get_metadata(ctx,
> +				(struct ocxl_ioctl_metadata __user *) args);
> +		break;
> +
>  	default:
>  		rc = -EINVAL;
>  	}
> diff --git a/include/uapi/misc/ocxl.h b/include/uapi/misc/ocxl.h
> index 4b0b0b756f3e..0af83d80fb3e 100644
> --- a/include/uapi/misc/ocxl.h
> +++ b/include/uapi/misc/ocxl.h
> @@ -32,6 +32,22 @@ struct ocxl_ioctl_attach {
>  	__u64 reserved3;
>  };
>
> +struct ocxl_ioctl_metadata {
> +	__u16 version; // struct version, always backwards compatible
> +
> +	// Version 0 fields
> +	__u8  afu_version_major;
> +	__u8  afu_version_minor;
> +	__u32 pasid; // PASID assigned to the current context
> +
> +	__u64 pp_mmio_size; // Per PASID MMIO size
> +	__u64 global_mmio_size;
> +
> +	// End version 0 fields
> +
> +	__u64 reserved[13]; // Total of 16*u64
> +};
> +
>  struct ocxl_ioctl_irq_fd {
>  	__u64 irq_offset;
>  	__s32 eventfd;
> @@ -45,5 +61,6 @@ struct ocxl_ioctl_irq_fd {
>  #define OCXL_IOCTL_IRQ_ALLOC	_IOR(OCXL_MAGIC, 0x11, __u64)
>  #define OCXL_IOCTL_IRQ_FREE	_IOW(OCXL_MAGIC, 0x12, __u64)
>  #define OCXL_IOCTL_IRQ_SET_FD	_IOW(OCXL_MAGIC, 0x13, struct ocxl_ioctl_irq_fd)
> +#define OCXL_IOCTL_GET_METADATA _IOR(OCXL_MAGIC, 0x14, struct ocxl_ioctl_metadata)
>
>  #endif /* _UAPI_MISC_OCXL_H */
Re: [PATCH] ocxl: Add get_metadata IOCTL to share OCXL information to userspace
On 21/02/2018 07:43, Balbir Singh wrote:
> On Wed, Feb 21, 2018 at 3:57 PM, Alastair D'Silva wrote:
>> From: Alastair D'Silva
>>
>> Some required information is not exposed to userspace currently (e.g. the
>> PASID). Pass this information back, along with other information which
>> is currently communicated via sysfs, which saves some parsing effort in
>> userspace.
>>
>> Signed-off-by: Alastair D'Silva
>> ---
>>  drivers/misc/ocxl/file.c | 27 +++
>>  include/uapi/misc/ocxl.h | 22 ++
>>  2 files changed, 49 insertions(+)
>>
>> diff --git a/drivers/misc/ocxl/file.c b/drivers/misc/ocxl/file.c
>> index d9aa407db06a..11514a8444e5 100644
>> --- a/drivers/misc/ocxl/file.c
>> +++ b/drivers/misc/ocxl/file.c
>> @@ -102,10 +102,32 @@ static long afu_ioctl_attach(struct ocxl_context *ctx,
>>  	return rc;
>>  }
>>
>> +static long afu_ioctl_get_metadata(struct ocxl_context *ctx,
>> +		struct ocxl_ioctl_get_metadata __user *uarg)
>
> Why do we call this metadata? Isn't this an afu_descriptor?
>
>> +{
>> +	struct ocxl_ioctl_get_metadata arg;
>> +
>> +	memset(&arg, 0, sizeof(arg));
>> +
>> +	arg.version = 0;
>
> Does it make sense to have version 0? Even if it does, you can afford
> to skip initialization due to the memset above. I prefer that versions
> start with 1
>
>> +
>> +	arg.afu_version_major = ctx->afu->config.version_major;
>> +	arg.afu_version_minor = ctx->afu->config.version_minor;
>> +	arg.pasid = ctx->pasid;
>> +	arg.pp_mmio_size = ctx->afu->config.pp_mmio_stride;
>> +	arg.global_mmio_size = ctx->afu->config.global_mmio_size;
>> +
>> +	if (copy_to_user(uarg, &arg, sizeof(arg)))
>> +		return -EFAULT;
>> +
>> +	return 0;
>> +}
>> +
>>  #define CMD_STR(x) (x == OCXL_IOCTL_ATTACH ? "ATTACH" :		\
>>  			x == OCXL_IOCTL_IRQ_ALLOC ? "IRQ_ALLOC" :	\
>>  			x == OCXL_IOCTL_IRQ_FREE ? "IRQ_FREE" :		\
>>  			x == OCXL_IOCTL_IRQ_SET_FD ? "IRQ_SET_FD" :	\
>> +			x == OCXL_IOCTL_GET_METADATA ? "GET_METADATA" :	\
>>  			"UNKNOWN")
>>
>>  static long afu_ioctl(struct file *file, unsigned int cmd,
>> @@ -157,6 +179,11 @@ static long afu_ioctl(struct file *file, unsigned int cmd,
>>  					irq_fd.eventfd);
>>  		break;
>>
>> +	case OCXL_IOCTL_GET_METADATA:
>> +		rc = afu_ioctl_get_metadata(ctx,
>> +				(struct ocxl_ioctl_get_metadata __user *) args);
>> +		break;
>> +
>>  	default:
>>  		rc = -EINVAL;
>>  	}
>> diff --git a/include/uapi/misc/ocxl.h b/include/uapi/misc/ocxl.h
>> index 4b0b0b756f3e..16e1f48ce280 100644
>> --- a/include/uapi/misc/ocxl.h
>> +++ b/include/uapi/misc/ocxl.h
>> @@ -32,6 +32,27 @@ struct ocxl_ioctl_attach {
>>  	__u64 reserved3;
>>  };
>>
>> +/*
>> + * Version contains the version of the struct.
>> + * Versions will always be backwards compatible, that is, new versions will not
>> + * alter existing fields
>> + */
>> +struct ocxl_ioctl_get_metadata {
>
> This sounds more like a function name, do we need it to be _get_metadata?
>
>> +	__u16 version;
>> +
>> +	// Version 0 fields
>> +	__u8  afu_version_major;
>> +	__u8  afu_version_minor;
>> +	__u32 pasid;
>> +
>> +	__u64 pp_mmio_size;
>> +	__u64 global_mmio_size;
>> +
>
> Should we document the fields? pp_ stands for per process, but is not
> very clear at first look. Why do we care to return only the size, what
> about lpc size?

My bad, I forgot to mention it before. There's a somewhat high-level description which needs updating in:
Documentation/accelerators/ocxl.rst

It doesn't go down to the level of the structure members, but at least all ioctl commands should have a brief description.

lpc_size could be added. It's currently useless to the library, but doesn't hurt. The one which was giving me trouble on a previous version of this patch was the lpc numa node ID, since that was experimental code and felt out of place considering what's been upstreamed in skiboot and linux so far.

  Fred

>> +	// End version 0 fields
>> +
>> +	__u64 reserved[13]; // Total of 16*u64
>> +};
>
> Balbir Singh.
Re: linux-4.16-rc1/drivers/misc/ocxl/file.c:320:broken error checking ?
On 12/02/2018 09:58, David Binderman wrote:
> Hello there,
>
> [linux-4.16-rc1/drivers/misc/ocxl/file.c:320]: (style) Checking if unsigned variable 'used' is less than zero.
>
> Source code is
>
>     used = append_xsl_error(ctx, &header, buf + sizeof(header));
>     if (used < 0)
>         return used;
>
> Suggest putting the return value from the function into a signed variable,
> sanity checking it, then assigning it to an unsigned variable.
>
> Also, use of the gcc compiler flag -Wtype-limits will show up
> this kind of problem in future.

Thanks for reporting it. A patch to address it is working its way up and should land in the next rc release.

  Fred

> Regards
>
> David Binderman
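The pattern David suggests (signed variable first, check, then assign to the unsigned one) can be sketched as follows. `append_stub()`, `stored_as_unsigned()` and `collect()` are hypothetical helpers written for this illustration; only the shape of the check comes from the report:

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>	/* ssize_t */

/* Hypothetical stand-in for append_xsl_error(): bytes appended, or -errno. */
static ssize_t append_stub(int fail)
{
	return fail ? -EFAULT : 24;
}

/* What a negative return value becomes once stored in an unsigned type. */
static size_t stored_as_unsigned(ssize_t rc)
{
	return (size_t)rc;
}

/* Check the signed result first, then hand it to the unsigned counter. */
static ssize_t collect(int fail, size_t *used)
{
	ssize_t rc = append_stub(fail);

	if (rc < 0)
		return rc;	/* the error is still visible here */
	*used = (size_t)rc;
	return 0;
}
```

A negative error code converted to size_t wraps around to a very large positive value, so a later "less than zero" test on the unsigned variable can never fire.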
Re: [PATCH] ocxl: fix signed comparison with less than zero
On 30/01/2018 16:11, Colin King wrote:
> From: Colin Ian King <colin.k...@canonical.com>
>
> Currently the comparison of used < 0 is always false because used
> is a size_t. Fix this by making used a ssize_t type.
>
> Detected by Coccinelle:
> drivers/misc/ocxl/file.c:320:6-10: WARNING: Unsigned expression compared
> with zero: used < 0
>
> Fixes: 5ef3166e8a32 ("ocxl: Driver code for 'generic' opencapi devices")
> Signed-off-by: Colin Ian King <colin.k...@canonical.com>
> ---

Thanks!
Acked-by: Frederic Barrat <fbar...@linux.vnet.ibm.com>

>  drivers/misc/ocxl/file.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/misc/ocxl/file.c b/drivers/misc/ocxl/file.c
> index c90c1a578d2f..1287e4430e6b 100644
> --- a/drivers/misc/ocxl/file.c
> +++ b/drivers/misc/ocxl/file.c
> @@ -277,7 +277,7 @@ static ssize_t afu_read(struct file *file, char __user *buf, size_t count,
>  	struct ocxl_context *ctx = file->private_data;
>  	struct ocxl_kernel_event_header header;
>  	ssize_t rc;
> -	size_t used = 0;
> +	ssize_t used = 0;
>  	DEFINE_WAIT(event_wait);
>
>  	memset(&header, 0, sizeof(header));
Re: [PATCH v2 12/13] ocxl: Documentation
On 25/01/2018 14:17, Greg KH wrote:
On Tue, Jan 23, 2018 at 12:31:47PM +0100, Frederic Barrat wrote:
ocxl.rst gives a quick, high-level view of opencapi.

Update ioctl-number.txt to reflect ioctl numbers being used by the
ocxl driver

Signed-off-by: Frederic Barrat <fbar...@linux.vnet.ibm.com>
---
 Documentation/ABI/testing/sysfs-class-ocxl | 35 +++
 Documentation/accelerators/ocxl.rst | 160 +
 Documentation/ioctl/ioctl-number.txt | 1 +
 3 files changed, 196 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-class-ocxl
 create mode 100644 Documentation/accelerators/ocxl.rst

diff --git a/Documentation/ABI/testing/sysfs-class-ocxl b/Documentation/ABI/testing/sysfs-class-ocxl
new file mode 100644
index ..ac11deb71235
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-class-ocxl
@@ -0,0 +1,35 @@
+What:		/sys/class/ocxl//afu_version
+Date:		January 2018
+Contact:	linuxppc-...@lists.ozlabs.org
+Description:	read only
+		Version of the AFU, in the format :
+		Reflects what is read in the configuration space of the AFU

Odd mix of tabs and spaces in this file, please just use tabs.

Oops! Will fix.

  Fred

thanks,

greg k-h
[PATCH v2 03/13] powerpc/powernv: Add opal calls for opencapi
Add opal calls to interact with the NPU:

OPAL_NPU_SPA_SETUP: set the Shared Process Area (SPA)
The SPA is a table containing one entry (Process Element) per memory
context which can be accessed by the opencapi device.

OPAL_NPU_SPA_CLEAR_CACHE: clear the context cache
The NPU keeps a cache of recently accessed memory contexts. When a
Process Element is removed from the SPA, the cache for the link must
be cleared.

OPAL_NPU_TL_SET: configure the Transaction Layer
The Transaction Layer specification defines several templates for
messages to be exchanged on the link. During link setup, the host and
device must negotiate what templates are supported on both sides and
at what rates those messages can be sent.

Signed-off-by: Frederic Barrat <fbar...@linux.vnet.ibm.com>
Acked-by: Andrew Donnellan <andrew.donnel...@au1.ibm.com>
---
 arch/powerpc/include/asm/opal-api.h | 5 -
 arch/powerpc/include/asm/opal.h | 6 ++
 arch/powerpc/platforms/powernv/opal-wrappers.S | 3 +++
 3 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/opal-api.h b/arch/powerpc/include/asm/opal-api.h
index 233c7504b1f2..24c73f5575ee 100644
--- a/arch/powerpc/include/asm/opal-api.h
+++ b/arch/powerpc/include/asm/opal-api.h
@@ -201,7 +201,10 @@
 #define OPAL_SET_POWER_SHIFT_RATIO		155
 #define OPAL_SENSOR_GROUP_CLEAR			156
 #define OPAL_PCI_SET_P2P			157
-#define OPAL_LAST				157
+#define OPAL_NPU_SPA_SETUP			159
+#define OPAL_NPU_SPA_CLEAR_CACHE		160
+#define OPAL_NPU_TL_SET				161
+#define OPAL_LAST				161
 
 /* Device tree flags */
 
diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
index 0c545f7fc77b..12e70fb58700 100644
--- a/arch/powerpc/include/asm/opal.h
+++ b/arch/powerpc/include/asm/opal.h
@@ -34,6 +34,12 @@ int64_t opal_npu_init_context(uint64_t phb_id, int pasid, uint64_t msr,
 			uint64_t bdf);
 int64_t opal_npu_map_lpar(uint64_t phb_id, uint64_t bdf, uint64_t lparid,
 			uint64_t lpcr);
+int64_t opal_npu_spa_setup(uint64_t phb_id, uint32_t bdfn,
+			uint64_t addr, uint64_t PE_mask);
+int64_t opal_npu_spa_clear_cache(uint64_t phb_id, uint32_t bdfn,
+			uint64_t PE_handle);
+int64_t opal_npu_tl_set(uint64_t phb_id, uint32_t bdfn, long cap,
+			uint64_t rate_phys, uint32_t size);
 int64_t opal_console_write(int64_t term_number, __be64 *length,
 			const uint8_t *buffer);
 int64_t opal_console_read(int64_t term_number, __be64 *length,
diff --git a/arch/powerpc/platforms/powernv/opal-wrappers.S b/arch/powerpc/platforms/powernv/opal-wrappers.S
index 6f4b00a2ac46..1b2936ba6040 100644
--- a/arch/powerpc/platforms/powernv/opal-wrappers.S
+++ b/arch/powerpc/platforms/powernv/opal-wrappers.S
@@ -320,3 +320,6 @@ OPAL_CALL(opal_set_powercap,			OPAL_SET_POWERCAP);
 OPAL_CALL(opal_get_power_shift_ratio,		OPAL_GET_POWER_SHIFT_RATIO);
 OPAL_CALL(opal_set_power_shift_ratio,		OPAL_SET_POWER_SHIFT_RATIO);
 OPAL_CALL(opal_sensor_group_clear,		OPAL_SENSOR_GROUP_CLEAR);
+OPAL_CALL(opal_npu_spa_setup,			OPAL_NPU_SPA_SETUP);
+OPAL_CALL(opal_npu_spa_clear_cache,		OPAL_NPU_SPA_CLEAR_CACHE);
+OPAL_CALL(opal_npu_tl_set,			OPAL_NPU_TL_SET);
-- 
2.14.1
[PATCH v2 02/13] powerpc/powernv: Set correct configuration space size for opencapi devices
From: Andrew Donnellan <andrew.donnel...@au1.ibm.com>

The configuration space for opencapi devices doesn't have a PCI
Express capability, which confuses Linux into thinking the device is
of an old PCI type with a 256-byte configuration space, instead of the
desired 4k. So add a PCI fixup to declare the correct size.

Signed-off-by: Andrew Donnellan <andrew.donnel...@au1.ibm.com>
Signed-off-by: Frederic Barrat <fbar...@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index e780263a14ee..d5af700820f3 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -4080,6 +4080,19 @@ void __init pnv_pci_init_npu2_opencapi_phb(struct device_node *np)
 	pnv_pci_init_ioda_phb(np, 0, PNV_PHB_NPU_OCAPI);
 }
 
+static void pnv_npu2_opencapi_cfg_size_fixup(struct pci_dev *dev)
+{
+	struct pci_controller *hose = pci_bus_to_host(dev->bus);
+	struct pnv_phb *phb = hose->private_data;
+
+	if (!machine_is(powernv))
+		return;
+
+	if (phb->type == PNV_PHB_NPU_OCAPI)
+		dev->cfg_size = PCI_CFG_SPACE_EXP_SIZE;
+}
+DECLARE_PCI_FIXUP_EARLY(PCI_ANY_ID, PCI_ANY_ID, pnv_npu2_opencapi_cfg_size_fixup);
+
 void __init pnv_pci_init_ioda_hub(struct device_node *np)
 {
 	struct device_node *phbn;
-- 
2.14.1
[PATCH v2 05/13] powerpc/powernv: Capture actag information for the device
In the opencapi protocol, host memory contexts are referenced by an
'actag'. During setup, a driver must tell the device how many actags
it can use, and what values are acceptable.

On POWER9, the NPU can handle 64 actags per link, so they must be
shared between all the PCI functions of the link. To get a global
picture of how many actags are used by each AFU of every function, we
capture some data at the end of PCI enumeration, so that actags can be
shared fairly if needed.

This is not powernv specific per se, but rather a consequence of the
opencapi configuration specification being quite general. The number
of available actags on POWER9 makes it more likely to be hit. This is
somewhat mitigated by the fact that existing AFUs are coded by
requesting a reasonable count of actags and existing devices carry
only one AFU.

Signed-off-by: Frederic Barrat <fbar...@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/pnv-ocxl.h | 4 +
 arch/powerpc/platforms/powernv/ocxl.c | 305 ++
 include/misc/ocxl-config.h | 45 +
 3 files changed, 354 insertions(+)
 create mode 100644 include/misc/ocxl-config.h

diff --git a/arch/powerpc/include/asm/pnv-ocxl.h b/arch/powerpc/include/asm/pnv-ocxl.h
index 36868d49aeed..398d05b30600 100644
--- a/arch/powerpc/include/asm/pnv-ocxl.h
+++ b/arch/powerpc/include/asm/pnv-ocxl.h
@@ -9,6 +9,10 @@
 #define PNV_OCXL_TL_BITS_PER_RATE	4
 #define PNV_OCXL_TL_RATE_BUF_SIZE	((PNV_OCXL_TL_MAX_TEMPLATE+1) * PNV_OCXL_TL_BITS_PER_RATE / 8)
 
+extern int pnv_ocxl_get_actag(struct pci_dev *dev, u16 *base, u16 *enabled,
+			u16 *supported);
+extern int pnv_ocxl_get_pasid_count(struct pci_dev *dev, int *count);
+
 extern int pnv_ocxl_get_tl_cap(struct pci_dev *dev, long *cap,
 			char *rate_buf, int rate_buf_size);
 extern int pnv_ocxl_set_tl_conf(struct pci_dev *dev, long cap,
diff --git a/arch/powerpc/platforms/powernv/ocxl.c b/arch/powerpc/platforms/powernv/ocxl.c
index d61186805a07..1faaa4ef6903 100644
--- a/arch/powerpc/platforms/powernv/ocxl.c
+++ b/arch/powerpc/platforms/powernv/ocxl.c
@@ -2,13 +2,318 @@
 // Copyright 2017 IBM Corp.
 #include
 #include
+#include
 #include "pci.h"
 
 #define PNV_OCXL_TL_P9_RECV_CAP		0x000Full
+#define PNV_OCXL_ACTAG_MAX		64
 /* PASIDs are 20-bit, but on P9, NPU can only handle 15 bits */
 #define PNV_OCXL_PASID_BITS		15
 #define PNV_OCXL_PASID_MAX	((1 << PNV_OCXL_PASID_BITS) - 1)
 
+#define AFU_PRESENT (1 << 31)
+#define AFU_INDEX_MASK 0x3F00
+#define AFU_INDEX_SHIFT 24
+#define ACTAG_MASK 0xFFF
+
+
+struct actag_range {
+	u16 start;
+	u16 count;
+};
+
+struct npu_link {
+	struct list_head list;
+	int domain;
+	int bus;
+	int dev;
+	u16 fn_desired_actags[8];
+	struct actag_range fn_actags[8];
+	bool assignment_done;
+};
+static struct list_head links_list = LIST_HEAD_INIT(links_list);
+static DEFINE_MUTEX(links_list_lock);
+
+
+/*
+ * opencapi actags handling:
+ *
+ * When sending commands, the opencapi device references the memory
+ * context it's targeting with an 'actag', which is really an alias
+ * for a (BDF, pasid) combination. When it receives a command, the NPU
+ * must do a lookup of the actag to identify the memory context. The
+ * hardware supports a finite number of actags per link (64 for
+ * POWER9).
+ *
+ * The device can carry multiple functions, and each function can have
+ * multiple AFUs. Each AFU advertises in its config space the number
+ * of desired actags. The host must configure in the config space of
+ * the AFU how many actags the AFU is really allowed to use (which can
+ * be less than what the AFU desires).
+ *
+ * When a PCI function is probed by the driver, it has no visibility
+ * about the other PCI functions and how many actags they'd like,
+ * which makes it impossible to distribute actags fairly among AFUs.
+ *
+ * Unfortunately, the only way to know how many actags a function
+ * desires is by looking at the data for each AFU in the config space
+ * and add them up. Similarly, the only way to know how many actags
+ * all the functions of the physical device desire is by adding the
+ * previously computed function counts. Then we can match that against
+ * what the hardware supports.
+ *
+ * To get a comprehensive view, we use a 'pci fixup': at the end of
+ * PCI enumeration, each function counts how many actags its AFUs
+ * desire and we save it in a 'npu_link' structure, shared between all
+ * the PCI functions of a same device. Therefore, when the first
+ * function is probed by the driver, we can get an idea of the total
+ * count of desired actags for the device, and assign the actags to
+ * the AFUs, by pro-rating if needed.
+ */
+
+static int find_dvsec_from_pos(struct pci_dev *dev, int dvsec_id, int pos)
+{
+	int vsec = pos;
+	u16 vendor, id;
+
+	while ((vsec = pci_find_next_ex
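The pro-rating mentioned at the end of the comment above can be sketched in plain user-space C. This is only an illustration under stated assumptions, not the code from the patch (which is cut off in this archive): assign_actags(), ACTAG_MAX and NR_FN are names made up for the example, and the scaling rule (each request multiplied by 64 and divided by the total, rounded down) is an assumption about what "pro-rating" means here.

```c
#include <stdint.h>

#define ACTAG_MAX	64	/* actags per link on POWER9, per the commit message */
#define NR_FN		8	/* PCI functions per device, matching the 8-entry arrays */

struct actag_range {
	uint16_t start;
	uint16_t count;
};

/*
 * Hypothetical helper: hand out consecutive actag ranges to each
 * function. If the total desired count fits in ACTAG_MAX, every
 * function gets what it asked for; otherwise each request is scaled
 * down proportionally (assumed rule: desired * ACTAG_MAX / total,
 * rounded down).
 */
static void assign_actags(const uint16_t desired[NR_FN],
			  struct actag_range out[NR_FN])
{
	uint32_t total = 0;
	uint16_t next = 0;
	int i;

	for (i = 0; i < NR_FN; i++)
		total += desired[i];

	for (i = 0; i < NR_FN; i++) {
		uint32_t count = desired[i];

		if (total > ACTAG_MAX)
			count = count * ACTAG_MAX / total;

		out[i].start = next;
		out[i].count = (uint16_t)count;
		next = (uint16_t)(next + count);
	}
}
```

For instance, two functions asking for 32 and 96 actags (128 in total) would end up with 16 and 48 under this rule, while requests of 10 and 20 would be granted unchanged.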
[PATCH v2 10/13] ocxl: Add Makefile and Kconfig
OCXL_BASE triggers the platform support needed by the driver.

Signed-off-by: Frederic Barrat <fbar...@linux.vnet.ibm.com>
Signed-off-by: Andrew Donnellan <andrew.donnel...@au1.ibm.com>
---
 drivers/misc/Kconfig | 1 +
 drivers/misc/Makefile | 1 +
 drivers/misc/ocxl/Kconfig | 31 +++
 drivers/misc/ocxl/Makefile | 11 +++
 4 files changed, 44 insertions(+)
 create mode 100644 drivers/misc/ocxl/Kconfig
 create mode 100644 drivers/misc/ocxl/Makefile

diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
index f1a5c2357b14..0534f338c84a 100644
--- a/drivers/misc/Kconfig
+++ b/drivers/misc/Kconfig
@@ -508,4 +508,5 @@ source "drivers/misc/mic/Kconfig"
 source "drivers/misc/genwqe/Kconfig"
 source "drivers/misc/echo/Kconfig"
 source "drivers/misc/cxl/Kconfig"
+source "drivers/misc/ocxl/Kconfig"
 endmenu
diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
index 5ca5f64df478..73326d54e246 100644
--- a/drivers/misc/Makefile
+++ b/drivers/misc/Makefile
@@ -55,6 +55,7 @@ obj-$(CONFIG_CXL_BASE)		+= cxl/
 obj-$(CONFIG_ASPEED_LPC_CTRL)	+= aspeed-lpc-ctrl.o
 obj-$(CONFIG_ASPEED_LPC_SNOOP)	+= aspeed-lpc-snoop.o
 obj-$(CONFIG_PCI_ENDPOINT_TEST)	+= pci_endpoint_test.o
+obj-$(CONFIG_OCXL)		+= ocxl/
 
 lkdtm-$(CONFIG_LKDTM)		+= lkdtm_core.o
 lkdtm-$(CONFIG_LKDTM)		+= lkdtm_bugs.o
diff --git a/drivers/misc/ocxl/Kconfig b/drivers/misc/ocxl/Kconfig
new file mode 100644
index ..4bbdb0d3c8ee
--- /dev/null
+++ b/drivers/misc/ocxl/Kconfig
@@ -0,0 +1,31 @@
+#
+# Open Coherent Accelerator (OCXL) compatible devices
+#
+
+config OCXL_BASE
+	bool
+	default n
+	select PPC_COPRO_BASE
+
+config OCXL
+	tristate "OpenCAPI coherent accelerator support"
+	depends on PPC_POWERNV && PCI && EEH
+	select OCXL_BASE
+	default m
+	help
+	  Select this option to enable the ocxl driver for Open
+	  Coherent Accelerator Processor Interface (OpenCAPI) devices.
+
+	  OpenCAPI allows FPGA and ASIC accelerators to be coherently
+	  attached to a CPU over an OpenCAPI link.
+
+	  The ocxl driver enables userspace programs to access these
+	  accelerators through devices in /dev/ocxl/.
+
+	  For more information, see http://opencapi.org.
+
+	  This is not to be confused with the support for IBM CAPI
+	  accelerators (CONFIG_CXL), which are PCI-based instead of a
+	  dedicated OpenCAPI link, and don't follow the same protocol.
+
+	  If unsure, say N.
diff --git a/drivers/misc/ocxl/Makefile b/drivers/misc/ocxl/Makefile
new file mode 100644
index ..5229dcda8297
--- /dev/null
+++ b/drivers/misc/ocxl/Makefile
@@ -0,0 +1,11 @@
+# SPDX-License-Identifier: GPL-2.0+
+ccflags-$(CONFIG_PPC_WERROR)	+= -Werror
+
+ocxl-y				+= main.o pci.o config.o file.o pasid.o
+ocxl-y				+= link.o context.o afu_irq.o sysfs.o trace.o
+obj-$(CONFIG_OCXL)		+= ocxl.o
+
+# For tracepoints to include our trace.h from tracepoint infrastructure:
+CFLAGS_trace.o := -I$(src)
+
+# ccflags-y += -DDEBUG
-- 
2.14.1
[PATCH v2 04/13] powerpc/powernv: Add platform-specific services for opencapi
Implement a few platform-specific calls which can be used by drivers:

- provide the Transaction Layer capabilities of the host, so that the
  driver can find some common ground and configure the device and host
  appropriately.

- provide the hw interrupt to be used for translation faults raised by
  the NPU

- map/unmap some NPU mmio registers to get the fault context when the
  NPU raises an address translation fault

The rest are wrappers around the previously-introduced opal calls.

Signed-off-by: Frederic Barrat <fbar...@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/pnv-ocxl.h | 29 +
 arch/powerpc/platforms/powernv/Makefile | 1 +
 arch/powerpc/platforms/powernv/ocxl.c | 180
 3 files changed, 210 insertions(+)
 create mode 100644 arch/powerpc/include/asm/pnv-ocxl.h
 create mode 100644 arch/powerpc/platforms/powernv/ocxl.c

diff --git a/arch/powerpc/include/asm/pnv-ocxl.h b/arch/powerpc/include/asm/pnv-ocxl.h
new file mode 100644
index ..36868d49aeed
--- /dev/null
+++ b/arch/powerpc/include/asm/pnv-ocxl.h
@@ -0,0 +1,29 @@
+// SPDX-License-Identifier: GPL-2.0+
+// Copyright 2017 IBM Corp.
+#ifndef _ASM_PNV_OCXL_H
+#define _ASM_PNV_OCXL_H
+
+#include
+
+#define PNV_OCXL_TL_MAX_TEMPLATE	63
+#define PNV_OCXL_TL_BITS_PER_RATE	4
+#define PNV_OCXL_TL_RATE_BUF_SIZE	((PNV_OCXL_TL_MAX_TEMPLATE+1) * PNV_OCXL_TL_BITS_PER_RATE / 8)
+
+extern int pnv_ocxl_get_tl_cap(struct pci_dev *dev, long *cap,
+			char *rate_buf, int rate_buf_size);
+extern int pnv_ocxl_set_tl_conf(struct pci_dev *dev, long cap,
+			uint64_t rate_buf_phys, int rate_buf_size);
+
+extern int pnv_ocxl_get_xsl_irq(struct pci_dev *dev, int *hwirq);
+extern void pnv_ocxl_unmap_xsl_regs(void __iomem *dsisr, void __iomem *dar,
+			void __iomem *tfc, void __iomem *pe_handle);
+extern int pnv_ocxl_map_xsl_regs(struct pci_dev *dev, void __iomem **dsisr,
+			void __iomem **dar, void __iomem **tfc,
+			void __iomem **pe_handle);
+
+extern int pnv_ocxl_spa_setup(struct pci_dev *dev, void *spa_mem, int PE_mask,
+			void **platform_data);
+extern void pnv_ocxl_spa_release(void *platform_data);
+extern int pnv_ocxl_spa_remove_pe(void *platform_data, int pe_handle);
+
+#endif /* _ASM_PNV_OCXL_H */
diff --git a/arch/powerpc/platforms/powernv/Makefile b/arch/powerpc/platforms/powernv/Makefile
index 3732118a0482..6c9d5199a7e2 100644
--- a/arch/powerpc/platforms/powernv/Makefile
+++ b/arch/powerpc/platforms/powernv/Makefile
@@ -17,3 +17,4 @@ obj-$(CONFIG_PERF_EVENTS) += opal-imc.o
 obj-$(CONFIG_PPC_MEMTRACE)	+= memtrace.o
 obj-$(CONFIG_PPC_VAS)	+= vas.o vas-window.o vas-debug.o
 obj-$(CONFIG_PPC_FTW)	+= nx-ftw.o
+obj-$(CONFIG_OCXL_BASE)	+= ocxl.o
diff --git a/arch/powerpc/platforms/powernv/ocxl.c b/arch/powerpc/platforms/powernv/ocxl.c
new file mode 100644
index ..d61186805a07
--- /dev/null
+++ b/arch/powerpc/platforms/powernv/ocxl.c
@@ -0,0 +1,180 @@
+// SPDX-License-Identifier: GPL-2.0+
+// Copyright 2017 IBM Corp.
+#include
+#include
+#include "pci.h"
+
+#define PNV_OCXL_TL_P9_RECV_CAP		0x000Full
+/* PASIDs are 20-bit, but on P9, NPU can only handle 15 bits */
+#define PNV_OCXL_PASID_BITS		15
+#define PNV_OCXL_PASID_MAX	((1 << PNV_OCXL_PASID_BITS) - 1)
+
+
+static void set_templ_rate(unsigned int templ, unsigned int rate, char *buf)
+{
+	int shift, idx;
+
+	WARN_ON(templ > PNV_OCXL_TL_MAX_TEMPLATE);
+	idx = (PNV_OCXL_TL_MAX_TEMPLATE - templ) / 2;
+	shift = 4 * (1 - ((PNV_OCXL_TL_MAX_TEMPLATE - templ) % 2));
+	buf[idx] |= rate << shift;
+}
+
+int pnv_ocxl_get_tl_cap(struct pci_dev *dev, long *cap,
+			char *rate_buf, int rate_buf_size)
+{
+	if (rate_buf_size != PNV_OCXL_TL_RATE_BUF_SIZE)
+		return -EINVAL;
+	/*
+	 * The TL capabilities are a characteristic of the NPU, so
+	 * we go with hard-coded values.
+	 *
+	 * The receiving rate of each template is encoded on 4 bits.
+	 *
+	 * On P9:
+	 * - templates 0 -> 3 are supported
+	 * - templates 0, 1 and 3 have a 0 receiving rate
+	 * - template 2 has receiving rate of 1 (extra cycle)
+	 */
+	memset(rate_buf, 0, rate_buf_size);
+	set_templ_rate(2, 1, rate_buf);
+	*cap = PNV_OCXL_TL_P9_RECV_CAP;
+	return 0;
+}
+EXPORT_SYMBOL_GPL(pnv_ocxl_get_tl_cap);
+
+int pnv_ocxl_set_tl_conf(struct pci_dev *dev, long cap,
+			uint64_t rate_buf_phys, int rate_buf_size)
+{
+	struct pci_controller *hose = pci_bus_to_host(dev->bus);
+	struct pnv_phb *phb = hose->private_data;
+	int rc;
+
+	if (rate_buf_size != PNV_OCXL_TL_RATE_BUF_SIZE)
+		return -EINVAL;
+
+	rc = opal_npu_tl
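The 4-bit rate encoding used by set_templ_rate() can be tried out in user space. The sketch below copies the index and shift arithmetic from the patch, with two adaptations that are assumptions of this example: assert() stands in for WARN_ON(), and the buffer is unsigned char. get_templ_rate() is a hypothetical inverse added only to read a nibble back; it is not part of the patch.

```c
#include <assert.h>

#define TL_MAX_TEMPLATE		63
#define TL_BITS_PER_RATE	4
/* 64 templates, 4 bits each: 32 bytes */
#define TL_RATE_BUF_SIZE	((TL_MAX_TEMPLATE + 1) * TL_BITS_PER_RATE / 8)

/*
 * Same arithmetic as in the patch: template 63 lands in the high
 * nibble of byte 0, template 0 in the low nibble of the last byte.
 */
static void set_templ_rate(unsigned int templ, unsigned int rate,
			   unsigned char *buf)
{
	int shift, idx;

	assert(templ <= TL_MAX_TEMPLATE);	/* user-space stand-in for WARN_ON() */
	idx = (TL_MAX_TEMPLATE - templ) / 2;
	shift = 4 * (1 - ((TL_MAX_TEMPLATE - templ) % 2));
	buf[idx] |= rate << shift;
}

/* Hypothetical inverse, for illustration only: read one 4-bit rate back. */
static unsigned int get_templ_rate(unsigned int templ, const unsigned char *buf)
{
	int idx = (TL_MAX_TEMPLATE - templ) / 2;
	int shift = 4 * (1 - ((TL_MAX_TEMPLATE - templ) % 2));

	return (buf[idx] >> shift) & 0xF;
}
```

With a zeroed 32-byte buffer, set_templ_rate(2, 1, buf) as called in pnv_ocxl_get_tl_cap() sets the low nibble of buf[30], since (63 - 2) / 2 = 30 and (63 - 2) % 2 = 1 gives a shift of 0.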
[PATCH v2 04/13] powerpc/powernv: Add platform-specific services for opencapi
Implement a few platform-specific calls which can be used by drivers: - provide the Transaction Layer capabilities of the host, so that the driver can find some common ground and configure the device and host appropriately. - provide the hw interrupt to be used for translation faults raised by the NPU - map/unmap some NPU mmio registers to get the fault context when the NPU raises an address translation fault The rest are wrappers around the previously-introduced opal calls. Signed-off-by: Frederic Barrat --- arch/powerpc/include/asm/pnv-ocxl.h | 29 + arch/powerpc/platforms/powernv/Makefile | 1 + arch/powerpc/platforms/powernv/ocxl.c | 180 3 files changed, 210 insertions(+) create mode 100644 arch/powerpc/include/asm/pnv-ocxl.h create mode 100644 arch/powerpc/platforms/powernv/ocxl.c diff --git a/arch/powerpc/include/asm/pnv-ocxl.h b/arch/powerpc/include/asm/pnv-ocxl.h new file mode 100644 index ..36868d49aeed --- /dev/null +++ b/arch/powerpc/include/asm/pnv-ocxl.h @@ -0,0 +1,29 @@ +// SPDX-License-Identifier: GPL-2.0+ +// Copyright 2017 IBM Corp. 
+#ifndef _ASM_PNV_OCXL_H
+#define _ASM_PNV_OCXL_H
+
+#include
+
+#define PNV_OCXL_TL_MAX_TEMPLATE	63
+#define PNV_OCXL_TL_BITS_PER_RATE	4
+#define PNV_OCXL_TL_RATE_BUF_SIZE	((PNV_OCXL_TL_MAX_TEMPLATE+1) * PNV_OCXL_TL_BITS_PER_RATE / 8)
+
+extern int pnv_ocxl_get_tl_cap(struct pci_dev *dev, long *cap,
+			char *rate_buf, int rate_buf_size);
+extern int pnv_ocxl_set_tl_conf(struct pci_dev *dev, long cap,
+			uint64_t rate_buf_phys, int rate_buf_size);
+
+extern int pnv_ocxl_get_xsl_irq(struct pci_dev *dev, int *hwirq);
+extern void pnv_ocxl_unmap_xsl_regs(void __iomem *dsisr, void __iomem *dar,
+			void __iomem *tfc, void __iomem *pe_handle);
+extern int pnv_ocxl_map_xsl_regs(struct pci_dev *dev, void __iomem **dsisr,
+			void __iomem **dar, void __iomem **tfc,
+			void __iomem **pe_handle);
+
+extern int pnv_ocxl_spa_setup(struct pci_dev *dev, void *spa_mem, int PE_mask,
+			void **platform_data);
+extern void pnv_ocxl_spa_release(void *platform_data);
+extern int pnv_ocxl_spa_remove_pe(void *platform_data, int pe_handle);
+
+#endif /* _ASM_PNV_OCXL_H */
diff --git a/arch/powerpc/platforms/powernv/Makefile b/arch/powerpc/platforms/powernv/Makefile
index 3732118a0482..6c9d5199a7e2 100644
--- a/arch/powerpc/platforms/powernv/Makefile
+++ b/arch/powerpc/platforms/powernv/Makefile
@@ -17,3 +17,4 @@ obj-$(CONFIG_PERF_EVENTS) += opal-imc.o
 obj-$(CONFIG_PPC_MEMTRACE) += memtrace.o
 obj-$(CONFIG_PPC_VAS) += vas.o vas-window.o vas-debug.o
 obj-$(CONFIG_PPC_FTW) += nx-ftw.o
+obj-$(CONFIG_OCXL_BASE) += ocxl.o
diff --git a/arch/powerpc/platforms/powernv/ocxl.c b/arch/powerpc/platforms/powernv/ocxl.c
new file mode 100644
index ..d61186805a07
--- /dev/null
+++ b/arch/powerpc/platforms/powernv/ocxl.c
@@ -0,0 +1,180 @@
+// SPDX-License-Identifier: GPL-2.0+
+// Copyright 2017 IBM Corp.
+#include
+#include
+#include "pci.h"
+
+#define PNV_OCXL_TL_P9_RECV_CAP		0x000Full
+/* PASIDs are 20-bit, but on P9, NPU can only handle 15 bits */
+#define PNV_OCXL_PASID_BITS		15
+#define PNV_OCXL_PASID_MAX		((1 << PNV_OCXL_PASID_BITS) - 1)
+
+
+static void set_templ_rate(unsigned int templ, unsigned int rate, char *buf)
+{
+	int shift, idx;
+
+	WARN_ON(templ > PNV_OCXL_TL_MAX_TEMPLATE);
+	idx = (PNV_OCXL_TL_MAX_TEMPLATE - templ) / 2;
+	shift = 4 * (1 - ((PNV_OCXL_TL_MAX_TEMPLATE - templ) % 2));
+	buf[idx] |= rate << shift;
+}
+
+int pnv_ocxl_get_tl_cap(struct pci_dev *dev, long *cap,
+			char *rate_buf, int rate_buf_size)
+{
+	if (rate_buf_size != PNV_OCXL_TL_RATE_BUF_SIZE)
+		return -EINVAL;
+	/*
+	 * The TL capabilities are a characteristic of the NPU, so
+	 * we go with hard-coded values.
+	 *
+	 * The receiving rate of each template is encoded on 4 bits.
+	 *
+	 * On P9:
+	 * - templates 0 -> 3 are supported
+	 * - templates 0, 1 and 3 have a 0 receiving rate
+	 * - template 2 has receiving rate of 1 (extra cycle)
+	 */
+	memset(rate_buf, 0, rate_buf_size);
+	set_templ_rate(2, 1, rate_buf);
+	*cap = PNV_OCXL_TL_P9_RECV_CAP;
+	return 0;
+}
+EXPORT_SYMBOL_GPL(pnv_ocxl_get_tl_cap);
+
+int pnv_ocxl_set_tl_conf(struct pci_dev *dev, long cap,
+			uint64_t rate_buf_phys, int rate_buf_size)
+{
+	struct pci_controller *hose = pci_bus_to_host(dev->bus);
+	struct pnv_phb *phb = hose->private_data;
+	int rc;
+
+	if (rate_buf_size != PNV_OCXL_TL_RATE_BUF_SIZE)
+		return -EINVAL;
+
+	rc = opal_npu_tl
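For anyone checking the rate buffer layout by hand, here is a minimal user-space sketch of the nibble packing done by set_templ_rate() in the hunk above. It is not part of the patch. The constants are copied from the pnv-ocxl.h hunk; assert() stands in for WARN_ON(), the 4-bit mask on the rate is an extra guard the patch does not have, and get_templ_rate() is a hypothetical helper added only to read a value back.

```c
#include <assert.h>
#include <string.h>

/* Values copied from the pnv-ocxl.h hunk of this patch. */
#define PNV_OCXL_TL_MAX_TEMPLATE	63
#define PNV_OCXL_TL_BITS_PER_RATE	4
#define PNV_OCXL_TL_RATE_BUF_SIZE \
	((PNV_OCXL_TL_MAX_TEMPLATE + 1) * PNV_OCXL_TL_BITS_PER_RATE / 8)

/*
 * User-space copy of set_templ_rate(). Template 63 lands in the high
 * nibble of byte 0, template 0 in the low nibble of the last byte.
 * Unlike the patch, the rate is masked to 4 bits here.
 */
static void set_templ_rate(unsigned int templ, unsigned int rate,
			   unsigned char *buf)
{
	unsigned int pos;

	assert(templ <= PNV_OCXL_TL_MAX_TEMPLATE);
	pos = PNV_OCXL_TL_MAX_TEMPLATE - templ;
	buf[pos / 2] |= (rate & 0xF) << (4 * (1 - pos % 2));
}

/* Hypothetical helper, not in the patch: read one template's rate back. */
static unsigned int get_templ_rate(unsigned int templ,
				   const unsigned char *buf)
{
	unsigned int pos;

	assert(templ <= PNV_OCXL_TL_MAX_TEMPLATE);
	pos = PNV_OCXL_TL_MAX_TEMPLATE - templ;
	return (buf[pos / 2] >> (4 * (1 - pos % 2))) & 0xF;
}
```

With a zeroed 32-byte buffer, the P9 setting from pnv_ocxl_get_tl_cap() (template 2, rate 1) ends up in the low nibble of byte 30, and template 3 would share that byte in the high nibble.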
[PATCH v2 06/13] ocxl: Driver code for 'generic' opencapi devices
Add an ocxl driver to handle generic opencapi devices. Of course, it's
not meant to be the only opencapi driver, any device is free to
implement its own. But if a host application only needs basic services
like attaching to an opencapi adapter, have translation faults handled
or allocate AFU interrupts, it should suffice.

The AFU config space must follow the opencapi specification and use
the expected vendor/device ID to be seen by the generic driver.

The driver exposes the device AFUs as a char device in /dev/ocxl/

Note that the driver currently doesn't handle memory attached to the
opencapi device.

Signed-off-by: Frederic Barrat <fbar...@linux.vnet.ibm.com>
Signed-off-by: Andrew Donnellan <andrew.donnel...@au1.ibm.com>
Signed-off-by: Alastair D'Silva <alast...@d-silva.org>
---
 drivers/misc/ocxl/config.c        | 712 ++
 drivers/misc/ocxl/context.c       | 230
 drivers/misc/ocxl/file.c          | 398 +
 drivers/misc/ocxl/link.c          | 603
 drivers/misc/ocxl/main.c          |  33 ++
 drivers/misc/ocxl/ocxl_internal.h | 193 +++
 drivers/misc/ocxl/pasid.c         | 107 ++
 drivers/misc/ocxl/pci.c           | 585 +++
 drivers/misc/ocxl/sysfs.c         | 142
 include/uapi/misc/ocxl.h          |  40 +++
 10 files changed, 3043 insertions(+)
 create mode 100644 drivers/misc/ocxl/config.c
 create mode 100644 drivers/misc/ocxl/context.c
 create mode 100644 drivers/misc/ocxl/file.c
 create mode 100644 drivers/misc/ocxl/link.c
 create mode 100644 drivers/misc/ocxl/main.c
 create mode 100644 drivers/misc/ocxl/ocxl_internal.h
 create mode 100644 drivers/misc/ocxl/pasid.c
 create mode 100644 drivers/misc/ocxl/pci.c
 create mode 100644 drivers/misc/ocxl/sysfs.c
 create mode 100644 include/uapi/misc/ocxl.h

diff --git a/drivers/misc/ocxl/config.c b/drivers/misc/ocxl/config.c
new file mode 100644
index ..ea8cca50ea06
--- /dev/null
+++ b/drivers/misc/ocxl/config.c
@@ -0,0 +1,712 @@
+// SPDX-License-Identifier: GPL-2.0+
+// Copyright 2017 IBM Corp.
+#include
+#include
+#include
+#include "ocxl_internal.h"
+
+#define EXTRACT_BIT(val, bit) (!!(val & BIT(bit)))
+#define EXTRACT_BITS(val, s, e) ((val & GENMASK(e, s)) >> s)
+
+#define OCXL_DVSEC_AFU_IDX_MASK		GENMASK(5, 0)
+#define OCXL_DVSEC_ACTAG_MASK		GENMASK(11, 0)
+#define OCXL_DVSEC_PASID_MASK		GENMASK(19, 0)
+#define OCXL_DVSEC_PASID_LOG_MASK	GENMASK(4, 0)
+
+#define OCXL_DVSEC_TEMPL_VERSION	0x0
+#define OCXL_DVSEC_TEMPL_NAME		0x4
+#define OCXL_DVSEC_TEMPL_AFU_VERSION	0x1C
+#define OCXL_DVSEC_TEMPL_MMIO_GLOBAL	0x20
+#define OCXL_DVSEC_TEMPL_MMIO_GLOBAL_SZ	0x28
+#define OCXL_DVSEC_TEMPL_MMIO_PP	0x30
+#define OCXL_DVSEC_TEMPL_MMIO_PP_SZ	0x38
+#define OCXL_DVSEC_TEMPL_MEM_SZ		0x3C
+#define OCXL_DVSEC_TEMPL_WWID		0x40
+
+#define OCXL_MAX_AFU_PER_FUNCTION	64
+#define OCXL_TEMPL_LEN			0x58
+#define OCXL_TEMPL_NAME_LEN		24
+#define OCXL_CFG_TIMEOUT		3
+
+static int find_dvsec(struct pci_dev *dev, int dvsec_id)
+{
+	int vsec = 0;
+	u16 vendor, id;
+
+	while ((vsec = pci_find_next_ext_capability(dev, vsec,
+						OCXL_EXT_CAP_ID_DVSEC))) {
+		pci_read_config_word(dev, vsec + OCXL_DVSEC_VENDOR_OFFSET,
+				&vendor);
+		pci_read_config_word(dev, vsec + OCXL_DVSEC_ID_OFFSET, &id);
+		if (vendor == PCI_VENDOR_ID_IBM && id == dvsec_id)
+			return vsec;
+	}
+	return 0;
+}
+
+static int find_dvsec_afu_ctrl(struct pci_dev *dev, u8 afu_idx)
+{
+	int vsec = 0;
+	u16 vendor, id;
+	u8 idx;
+
+	while ((vsec = pci_find_next_ext_capability(dev, vsec,
+						OCXL_EXT_CAP_ID_DVSEC))) {
+		pci_read_config_word(dev, vsec + OCXL_DVSEC_VENDOR_OFFSET,
+				&vendor);
+		pci_read_config_word(dev, vsec + OCXL_DVSEC_ID_OFFSET, &id);
+
+		if (vendor == PCI_VENDOR_ID_IBM &&
+			id == OCXL_DVSEC_AFU_CTRL_ID) {
+			pci_read_config_byte(dev,
+					vsec + OCXL_DVSEC_AFU_CTRL_AFU_IDX,
+					&idx);
+			if (idx == afu_idx)
+				return vsec;
+		}
+	}
+	return 0;
+}
+
+static int read_pasid(struct pci_dev *dev, struct ocxl_fn_config *fn)
+{
+	u16 val;
+	int pos;
+
+	pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_PASID);
+	if (!pos) {
+		/*
+		 * PASID capability is not mandatory, but there
+		 * shouldn't be any AFU
+		 */
+		dev_dbg(&dev->dev, "Function doesn't require any PASID\n");
+
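A note on the EXTRACT_BIT()/EXTRACT_BITS() helpers defined at the top of config.c: the sketch below shows how they behave on a 32-bit value. It is a user-space illustration, not driver code. BIT() and GENMASK() are re-implemented here for 32-bit values as stand-ins for the kernel macros, the macro arguments are parenthesised (the patch does not do that), and sample_field() with its bit range is purely hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* User-space stand-ins for the kernel's BIT() and GENMASK(), 32-bit only. */
#define BIT(n)		(UINT32_C(1) << (n))
#define GENMASK(h, l)	((UINT32_MAX << (l)) & (UINT32_MAX >> (31 - (h))))

/* Same shape as the macros in config.c, with arguments parenthesised. */
#define EXTRACT_BIT(val, bit)	(!!((val) & BIT(bit)))
#define EXTRACT_BITS(val, s, e)	(((val) & GENMASK(e, s)) >> (s))

/* The widths used by the DVSEC masks in the patch: 12 and 20 bits. */
static_assert(GENMASK(11, 0) == 0xFFF, "actag mask is 12 bits wide");
static_assert(GENMASK(19, 0) == 0xFFFFF, "PASID mask is 20 bits wide");

/*
 * Hypothetical example: pull a 12-bit field out of bits 8..19 of a
 * register value. The bit range is an assumption made for illustration,
 * not a layout taken from the opencapi specification.
 */
static uint32_t sample_field(uint32_t reg)
{
	return EXTRACT_BITS(reg, 8, 19);
}
```

Note the argument order: EXTRACT_BITS() takes the start bit first and the end bit second, the reverse of GENMASK(high, low).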
[PATCH v2 07/13] ocxl: Add AFU interrupt support
Add user APIs through ioctl to allocate, free, and be notified of an
AFU interrupt.

For opencapi, an AFU can trigger an interrupt on the host by sending a
specific command targeting a 64-bit object handle. On POWER9, this is
implemented by mapping a special page in the address space of a
process and a write to that page will trigger an interrupt.

Signed-off-by: Frederic Barrat <fbar...@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/pnv-ocxl.h   |   3 +
 arch/powerpc/platforms/powernv/ocxl.c |  30 ++
 drivers/misc/ocxl/afu_irq.c           | 197 ++
 drivers/misc/ocxl/context.c           |  51 -
 drivers/misc/ocxl/file.c              |  34 ++
 drivers/misc/ocxl/link.c              |  28 +
 drivers/misc/ocxl/ocxl_internal.h     |   7 ++
 include/uapi/misc/ocxl.h              |   9 ++
 8 files changed, 357 insertions(+), 2 deletions(-)
 create mode 100644 drivers/misc/ocxl/afu_irq.c

diff --git a/arch/powerpc/include/asm/pnv-ocxl.h b/arch/powerpc/include/asm/pnv-ocxl.h
index 398d05b30600..f6945d3bc971 100644
--- a/arch/powerpc/include/asm/pnv-ocxl.h
+++ b/arch/powerpc/include/asm/pnv-ocxl.h
@@ -30,4 +30,7 @@ extern int pnv_ocxl_spa_setup(struct pci_dev *dev, void *spa_mem, int PE_mask,
 extern void pnv_ocxl_spa_release(void *platform_data);
 extern int pnv_ocxl_spa_remove_pe(void *platform_data, int pe_handle);
 
+extern int pnv_ocxl_alloc_xive_irq(u32 *irq, u64 *trigger_addr);
+extern void pnv_ocxl_free_xive_irq(u32 irq);
+
 #endif /* _ASM_PNV_OCXL_H */
diff --git a/arch/powerpc/platforms/powernv/ocxl.c b/arch/powerpc/platforms/powernv/ocxl.c
index 1faaa4ef6903..fa9b53af3c7b 100644
--- a/arch/powerpc/platforms/powernv/ocxl.c
+++ b/arch/powerpc/platforms/powernv/ocxl.c
@@ -2,6 +2,7 @@
 // Copyright 2017 IBM Corp.
 #include
 #include
+#include
 #include
 #include "pci.h"
 
@@ -483,3 +484,32 @@ int pnv_ocxl_spa_remove_pe(void *platform_data, int pe_handle)
 	return rc;
 }
 EXPORT_SYMBOL_GPL(pnv_ocxl_spa_remove_pe);
+
+int pnv_ocxl_alloc_xive_irq(u32 *irq, u64 *trigger_addr)
+{
+	__be64 flags, trigger_page;
+	s64 rc;
+	u32 hwirq;
+
+	hwirq = xive_native_alloc_irq();
+	if (!hwirq)
+		return -ENOENT;
+
+	rc = opal_xive_get_irq_info(hwirq, &flags, NULL, &trigger_page, NULL,
+				NULL);
+	if (rc || !trigger_page) {
+		xive_native_free_irq(hwirq);
+		return -ENOENT;
+	}
+	*irq = hwirq;
+	*trigger_addr = be64_to_cpu(trigger_page);
+	return 0;
+
+}
+EXPORT_SYMBOL_GPL(pnv_ocxl_alloc_xive_irq);
+
+void pnv_ocxl_free_xive_irq(u32 irq)
+{
+	xive_native_free_irq(irq);
+}
+EXPORT_SYMBOL_GPL(pnv_ocxl_free_xive_irq);
diff --git a/drivers/misc/ocxl/afu_irq.c b/drivers/misc/ocxl/afu_irq.c
new file mode 100644
index ..f40d853de401
--- /dev/null
+++ b/drivers/misc/ocxl/afu_irq.c
@@ -0,0 +1,197 @@
+// SPDX-License-Identifier: GPL-2.0+
+// Copyright 2017 IBM Corp.
+#include
+#include
+#include
+#include "ocxl_internal.h"
+
+struct afu_irq {
+	int id;
+	int hw_irq;
+	unsigned int virq;
+	char *name;
+	u64 trigger_page;
+	struct eventfd_ctx *ev_ctx;
+};
+
+static int irq_offset_to_id(struct ocxl_context *ctx, u64 offset)
+{
+	return (offset - ctx->afu->irq_base_offset) >> PAGE_SHIFT;
+}
+
+static u64 irq_id_to_offset(struct ocxl_context *ctx, int id)
+{
+	return ctx->afu->irq_base_offset + (id << PAGE_SHIFT);
+}
+
+static irqreturn_t afu_irq_handler(int virq, void *data)
+{
+	struct afu_irq *irq = (struct afu_irq *) data;
+
+	if (irq->ev_ctx)
+		eventfd_signal(irq->ev_ctx, 1);
+	return IRQ_HANDLED;
+}
+
+static int setup_afu_irq(struct ocxl_context *ctx, struct afu_irq *irq)
+{
+	int rc;
+
+	irq->virq = irq_create_mapping(NULL, irq->hw_irq);
+	if (!irq->virq) {
+		pr_err("irq_create_mapping failed\n");
+		return -ENOMEM;
+	}
+	pr_debug("hw_irq %d mapped to virq %u\n", irq->hw_irq, irq->virq);
+
+	irq->name = kasprintf(GFP_KERNEL, "ocxl-afu-%u", irq->virq);
+	if (!irq->name) {
+		irq_dispose_mapping(irq->virq);
+		return -ENOMEM;
+	}
+
+	rc = request_irq(irq->virq, afu_irq_handler, 0, irq->name, irq);
+	if (rc) {
+		kfree(irq->name);
+		irq->name = NULL;
+		irq_dispose_mapping(irq->virq);
+		pr_err("request_irq failed: %d\n", rc);
+		return rc;
+	}
+	return 0;
+}
+
+static void release_afu_irq(struct afu_irq *irq)
+{
+	free_irq(irq->virq, irq);
+	irq_dispose_mapping(irq->virq);
+	kfree(irq->name);
+}
+
+int ocxl_afu_irq_alloc(struct ocxl_context *ctx, u64 *irq_offset)
+{
+	struct afu_irq *irq;
+	int rc;
+
+	irq = kzal
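To make the id/offset mapping in afu_irq.c easier to follow, here is a small user-space sketch of irq_offset_to_id() and irq_id_to_offset(). Each AFU interrupt gets one page in the context's mmap space, starting at irq_base_offset. This is an illustration only: struct afu_sketch is a hypothetical stand-in for the afu field used by the helpers, PAGE_SHIFT is assumed to be 16 (64K pages), and the id is widened before the shift, which the patch does not do.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 16	/* assumption: 64K pages */

/* Hypothetical, trimmed stand-in for the afu data used by the helpers. */
struct afu_sketch {
	uint64_t irq_base_offset;
};

/* Any offset inside an interrupt's page maps back to that interrupt's id. */
static int irq_offset_to_id(const struct afu_sketch *afu, uint64_t offset)
{
	assert(offset >= afu->irq_base_offset);
	return (int)((offset - afu->irq_base_offset) >> PAGE_SHIFT);
}

/* Start of the page for a given id; id is widened before shifting. */
static uint64_t irq_id_to_offset(const struct afu_sketch *afu, int id)
{
	assert(id >= 0);
	return afu->irq_base_offset + ((uint64_t)id << PAGE_SHIFT);
}
```

The offset returned to user space by the allocation ioctl is what the process passes to mmap() to obtain the trigger page for that interrupt.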
[PATCH v2 12/13] ocxl: Documentation
ocxl.rst gives a quick, high-level view of opencapi.

Update ioctl-number.txt to reflect ioctl numbers being used by the
ocxl driver.

Signed-off-by: Frederic Barrat <fbar...@linux.vnet.ibm.com>
---
 Documentation/ABI/testing/sysfs-class-ocxl |  35 +++
 Documentation/accelerators/ocxl.rst        | 160 +
 Documentation/ioctl/ioctl-number.txt       |   1 +
 3 files changed, 196 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-class-ocxl
 create mode 100644 Documentation/accelerators/ocxl.rst

diff --git a/Documentation/ABI/testing/sysfs-class-ocxl b/Documentation/ABI/testing/sysfs-class-ocxl
new file mode 100644
index ..ac11deb71235
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-class-ocxl
@@ -0,0 +1,35 @@
+What:		/sys/class/ocxl//afu_version
+Date:		January 2018
+Contact:	linuxppc-...@lists.ozlabs.org
+Description:	read only
+		Version of the AFU, in the format :
+		Reflects what is read in the configuration space of the AFU
+
+What:		/sys/class/ocxl//contexts
+Date:		January 2018
+Contact:	linuxppc-...@lists.ozlabs.org
+Description:	read only
+		Number of contexts for the AFU, in the format /
+		where:
+			n: number of currently active contexts, for debug
+			max: maximum number of contexts supported by the AFU
+
+What:		/sys/class/ocxl//pp_mmio_size
+Date:		January 2018
+Contact:	linuxppc-...@lists.ozlabs.org
+Description:	read only
+		Size of the per-process mmio area, as defined in the
+		configuration space of the AFU
+
+What:		/sys/class/ocxl//global_mmio_size
+Date:		January 2018
+Contact:	linuxppc-...@lists.ozlabs.org
+Description:	read only
+		Size of the global mmio area, as defined in the
+		configuration space of the AFU
+
+What:		/sys/class/ocxl//global_mmio_area
+Date:		January 2018
+Contact:	linuxppc-...@lists.ozlabs.org
+Description:	read/write
+		Give access to the global mmio area for the AFU
diff --git a/Documentation/accelerators/ocxl.rst b/Documentation/accelerators/ocxl.rst
new file mode 100644
index ..4f7af841d935
--- /dev/null
+++ b/Documentation/accelerators/ocxl.rst
@@ -0,0 +1,160 @@
+========================================================
+OpenCAPI (Open Coherent Accelerator Processor Interface)
+========================================================
+
+OpenCAPI is an interface between processors and accelerators. It aims
+at being low-latency and high-bandwidth. The specification is
+developed by the `OpenCAPI Consortium <http://opencapi.org/>`_.
+
+It allows an accelerator (which could be an FPGA, ASICs, ...) to access
+the host memory coherently, using virtual addresses. An OpenCAPI
+device can also host its own memory, that can be accessed from the
+host.
+
+OpenCAPI is known in Linux as 'ocxl', as the open, processor-agnostic
+evolution of 'cxl' (the driver for the IBM CAPI interface for
+powerpc), which was named that way to avoid confusion with the ISDN
+CAPI subsystem.
+
+
+High-level view
+===============
+
+OpenCAPI defines a Data Link Layer (DL) and Transaction Layer (TL), to
+be implemented on top of a physical link. Any processor or device
+implementing the DL and TL can start sharing memory.
+
+::
+
+  +-----------+                         +-------------+
+  |           |                         |             |
+  |           |                         | Accelerated |
+  | Processor |                         |  Function   |
+  |           |  +--------+             |    Unit     |  +--------+
+  |           |--| Memory |             |    (AFU)    |--| Memory |
+  |           |  +--------+             |             |  +--------+
+  +-----------+                         +-------------+
+       |                                       |
+  +-----------+                         +-------------+
+  |    TL     |                         |    TLX      |
+  +-----------+                         +-------------+
+       |                                       |
+  +-----------+                         +-------------+
+  |    DL     |                         |    DLX      |
+  +-----------+                         +-------------+
+       |                                       |
+       |                   PHY                 |
+       +---------------------------------------+
+
+
+
+Device discovery
+================
+
+OpenCAPI relies on a PCI-like configuration space, implemented on the
+device. So the host can discover AFUs by querying the config space.
+
+OpenCAPI devices in Linux are treated like PCI devices (with a few
+caveats). The firmware is expected to abstract the hardware as if it
+was a PCI link. A lot of the existing PCI infrastructure is reused:
+devices are scanned and BARs are assigned during the standa
[PATCH v2 09/13] ocxl: Add trace points
Define a few trace points so that we can use the standard tracing
mechanism for debug and/or monitoring.

Signed-off-by: Frederic Barrat <fbar...@linux.vnet.ibm.com>
---
 drivers/misc/ocxl/afu_irq.c |   5 ++
 drivers/misc/ocxl/context.c |   2 +
 drivers/misc/ocxl/link.c    |  11 ++-
 drivers/misc/ocxl/trace.c   |   6 ++
 drivers/misc/ocxl/trace.h   | 182
 5 files changed, 205 insertions(+), 1 deletion(-)
 create mode 100644 drivers/misc/ocxl/trace.c
 create mode 100644 drivers/misc/ocxl/trace.h

diff --git a/drivers/misc/ocxl/afu_irq.c b/drivers/misc/ocxl/afu_irq.c
index f40d853de401..e70cfa24577f 100644
--- a/drivers/misc/ocxl/afu_irq.c
+++ b/drivers/misc/ocxl/afu_irq.c
@@ -4,6 +4,7 @@
 #include
 #include
 #include "ocxl_internal.h"
+#include "trace.h"
 
 struct afu_irq {
 	int id;
@@ -28,6 +29,7 @@ static irqreturn_t afu_irq_handler(int virq, void *data)
 {
 	struct afu_irq *irq = (struct afu_irq *) data;
 
+	trace_ocxl_afu_irq_receive(virq);
 	if (irq->ev_ctx)
 		eventfd_signal(irq->ev_ctx, 1);
 	return IRQ_HANDLED;
@@ -102,6 +104,8 @@ int ocxl_afu_irq_alloc(struct ocxl_context *ctx, u64 *irq_offset)
 
 	*irq_offset = irq_id_to_offset(ctx, irq->id);
 
+	trace_ocxl_afu_irq_alloc(ctx->pasid, irq->id, irq->virq, irq->hw_irq,
+				*irq_offset);
 	mutex_unlock(&ctx->irq_lock);
 	return 0;
 
@@ -117,6 +121,7 @@ int ocxl_afu_irq_alloc(struct ocxl_context *ctx, u64 *irq_offset)
 
 static void afu_irq_free(struct afu_irq *irq, struct ocxl_context *ctx)
 {
+	trace_ocxl_afu_irq_free(ctx->pasid, irq->id);
 	if (ctx->mapping)
 		unmap_mapping_range(ctx->mapping,
 				irq_id_to_offset(ctx, irq->id),
diff --git a/drivers/misc/ocxl/context.c b/drivers/misc/ocxl/context.c
index 269149490063..909e8807824a 100644
--- a/drivers/misc/ocxl/context.c
+++ b/drivers/misc/ocxl/context.c
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0+
 // Copyright 2017 IBM Corp.
 #include
+#include "trace.h"
 #include "ocxl_internal.h"
 
 struct ocxl_context *ocxl_context_alloc(void)
@@ -214,6 +215,7 @@ int ocxl_context_detach(struct ocxl_context *ctx)
 	mutex_lock(&ctx->afu->afu_control_lock);
 	rc = ocxl_config_terminate_pasid(dev, afu_control_pos, ctx->pasid);
 	mutex_unlock(&ctx->afu->afu_control_lock);
+	trace_ocxl_terminate_pasid(ctx->pasid, rc);
 	if (rc) {
 		/*
 		 * If we timeout waiting for the AFU to terminate the
diff --git a/drivers/misc/ocxl/link.c b/drivers/misc/ocxl/link.c
index fbca3feec592..f30790582dc0 100644
--- a/drivers/misc/ocxl/link.c
+++ b/drivers/misc/ocxl/link.c
@@ -7,6 +7,7 @@
 #include
 #include
 #include "ocxl_internal.h"
+#include "trace.h"
 
 
 #define SPA_PASID_BITS		15
@@ -116,8 +117,11 @@ static void ack_irq(struct spa *spa, enum xsl_response r)
 	else
 		WARN(1, "Invalid irq response %d\n", r);
 
-	if (reg)
+	if (reg) {
+		trace_ocxl_fault_ack(spa->spa_mem, spa->xsl_fault.pe,
+				spa->xsl_fault.dsisr, spa->xsl_fault.dar, reg);
 		out_be64(spa->reg_tfc, reg);
+	}
 }
 
 static void xsl_fault_handler_bh(struct work_struct *fault_work)
@@ -182,6 +186,7 @@ static irqreturn_t xsl_fault_handler(int irq, void *data)
 	int lpid, pid, tid;
 
 	read_irq(spa, &dsisr, &dar, &pe_handle);
+	trace_ocxl_fault(spa->spa_mem, pe_handle, dsisr, dar, -1);
 
 	WARN_ON(pe_handle > SPA_PE_MASK);
 	pe = spa->spa_mem + pe_handle;
@@ -532,6 +537,7 @@ int ocxl_link_add_pe(void *link_handle, int pasid, u32 pidr, u32 tidr,
 	 * the problem.
 	 */
 	mmgrab(mm);
+	trace_ocxl_context_add(current->pid, spa->spa_mem, pasid, pidr, tidr);
 unlock:
 	mutex_unlock(&spa->spa_lock);
 	return rc;
@@ -577,6 +583,9 @@ int ocxl_link_remove_pe(void *link_handle, int pasid)
 		goto unlock;
 	}
 
+	trace_ocxl_context_remove(current->pid, spa->spa_mem, pasid,
+				be32_to_cpu(pe->pid), be32_to_cpu(pe->tid));
+
 	memset(pe, 0, sizeof(struct ocxl_process_element));
 	/*
 	 * The barrier makes sure the PE is removed from the SPA
diff --git a/drivers/misc/ocxl/trace.c b/drivers/misc/ocxl/trace.c
new file mode 100644
index ..1e6947049697
--- /dev/null
+++ b/drivers/misc/ocxl/trace.c
@@ -0,0 +1,6 @@
+// SPDX-License-Identifier: GPL-2.0+
+// Copyright 2017 IBM Corp.
+#ifndef __CHECKER__
+#define CREATE_TRACE_POINTS
+#include "trace.h"
+#endif
diff --git a/drivers/misc/ocxl/trace.h b/drivers/misc/ocxl/trace.h
new file mode 100644
index ..bcb7ff330c1e
--- /dev/null
+++ b/drivers/misc