Re: [PATCH v5 06/13] iommufd: Allow pt_id to carry viommu_id for IOMMU_HWPT_ALLOC
On Mon, 28 Oct 2024 at 11:24, Zhangfei Gao wrote: > By the way, has qemu changed compared with v3? > I still got a hardware error in this version, in check Found iommufd_viommu_p2-v5 misses some patches, Simply tested ok with iommufd_viommu_p2-v5-with-rmr, (with some hacks) Thanks
Re: [PATCH v5 06/13] iommufd: Allow pt_id to carry viommu_id for IOMMU_HWPT_ALLOC
On Sat, 26 Oct 2024 at 07:50, Nicolin Chen wrote: > > Now a vIOMMU holds a shareable nesting parent HWPT. So, it can act like > that nesting parent HWPT to allocate a nested HWPT. > > Support that in the IOMMU_HWPT_ALLOC ioctl handler, and update its kdoc. > > Also, add an iommufd_hwpt_nested_alloc_for_viommu helper to allocate a > nested HWPT for a vIOMMU object. Since a vIOMMU object holds the parent > hwpt's refcount already, increase the refcount of the vIOMMU only. > > Signed-off-by: Nicolin Chen > --- > drivers/iommu/iommufd/iommufd_private.h | 1 + > include/uapi/linux/iommufd.h| 14 ++--- > drivers/iommu/iommufd/hw_pagetable.c| 71 - > 3 files changed, 79 insertions(+), 7 deletions(-) > > diff --git a/drivers/iommu/iommufd/iommufd_private.h > b/drivers/iommu/iommufd/iommufd_private.h > index 9adf8d616796..8c9ab35eaea5 100644 > --- a/drivers/iommu/iommufd/iommufd_private.h > +++ b/drivers/iommu/iommufd/iommufd_private.h > @@ -288,6 +288,7 @@ struct iommufd_hwpt_paging { > struct iommufd_hwpt_nested { > struct iommufd_hw_pagetable common; > struct iommufd_hwpt_paging *parent; > + struct iommufd_viommu *viommu; > }; > > static inline bool hwpt_is_paging(struct iommufd_hw_pagetable *hwpt) > diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h > index 3d320d069654..717659b9fdce 100644 > --- a/include/uapi/linux/iommufd.h > +++ b/include/uapi/linux/iommufd.h > @@ -430,7 +430,7 @@ enum iommu_hwpt_data_type { > * @size: sizeof(struct iommu_hwpt_alloc) > * @flags: Combination of enum iommufd_hwpt_alloc_flags > * @dev_id: The device to allocate this HWPT for > - * @pt_id: The IOAS or HWPT to connect this HWPT to > + * @pt_id: The IOAS or HWPT or vIOMMU to connect this HWPT to > * @out_hwpt_id: The ID of the new HWPT > * @__reserved: Must be 0 > * @data_type: One of enum iommu_hwpt_data_type > @@ -449,11 +449,13 @@ enum iommu_hwpt_data_type { > * IOMMU_HWPT_DATA_NONE. The HWPT can be allocated as a parent HWPT for a > * nesting configuration by passing IOMMU_HWPT_ALLOC_NEST_PARENT via @flags. > * > - * A user-managed nested HWPT will be created from a given parent HWPT via > - * @pt_id, in which the parent HWPT must be allocated previously via the > - * same ioctl from a given IOAS (@pt_id). In this case, the @data_type > - * must be set to a pre-defined type corresponding to an I/O page table > - * type supported by the underlying IOMMU hardware. > + * A user-managed nested HWPT will be created from a given vIOMMU (wrapping a > + * parent HWPT) or a parent HWPT via @pt_id, in which the parent HWPT must be > + * allocated previously via the same ioctl from a given IOAS (@pt_id). In > this > + * case, the @data_type must be set to a pre-defined type corresponding to an > + * I/O page table type supported by the underlying IOMMU hardware. The device > + * via @dev_id and the vIOMMU via @pt_id must be associated to the same IOMMU > + * instance. > * > * If the @data_type is set to IOMMU_HWPT_DATA_NONE, @data_len and > * @data_uptr should be zero. Otherwise, both @data_len and @data_uptr > diff --git a/drivers/iommu/iommufd/hw_pagetable.c > b/drivers/iommu/iommufd/hw_pagetable.c > index d06bf6e6c19f..1df5d40c93df 100644 > --- a/drivers/iommu/iommufd/hw_pagetable.c > +++ b/drivers/iommu/iommufd/hw_pagetable.c > @@ -57,7 +57,10 @@ void iommufd_hwpt_nested_destroy(struct iommufd_object > *obj) > container_of(obj, struct iommufd_hwpt_nested, common.obj); > > __iommufd_hwpt_destroy(&hwpt_nested->common); > - refcount_dec(&hwpt_nested->parent->common.obj.users); > + if (hwpt_nested->viommu) > + refcount_dec(&hwpt_nested->viommu->obj.users); > + else > + refcount_dec(&hwpt_nested->parent->common.obj.users); > } > > void iommufd_hwpt_nested_abort(struct iommufd_object *obj) > @@ -260,6 +263,56 @@ iommufd_hwpt_nested_alloc(struct iommufd_ctx *ictx, > return ERR_PTR(rc); > } > > +/** > + * iommufd_viommu_alloc_hwpt_nested() - Get a hwpt_nested for a vIOMMU > + * @viommu: vIOMMU ojbect to associate the hwpt_nested/domain with > + * @user_data: user_data pointer. Must be valid > + * > + * Allocate a new IOMMU_DOMAIN_NESTED for a vIOMMU and return it as a NESTED > + * hw_pagetable. > + */ > +static struct iommufd_hwpt_nested * > +iommufd_viommu_alloc_hwpt_nested(struct iommufd_viommu *viommu, u32 flags, > +const struct iommu_user_data *user_data) > +{ > + struct iommufd_hwpt_nested *hwpt_nested; > + struct iommufd_hw_pagetable *hwpt; > + int rc; > + > + if (flags) > + return ERR_PTR(-EOPNOTSUPP); This check should be removed. When a user page fault is required, IOMMU_HWPT_FAULT_ID_VALID is set. if (cmd->flags & IOMMU_HWPT_FAULT_ID_VALID) { By the way, has qemu changed compared with v3? I still got a hardware error in this version, in check Thanks
Re: [PATCH v5 07/13] iommufd/viommu: Add iommufd_viommu_find_dev helper
On Sat, 26 Oct 2024 at 07:51, Nicolin Chen wrote: > > This avoids a bigger trouble of exposing struct iommufd_device and struct > iommufd_vdevice in the public header. > > Signed-off-by: Nicolin Chen > --- > include/linux/iommufd.h| 8 > drivers/iommu/iommufd/driver.c | 13 + > 2 files changed, 21 insertions(+) > > diff --git a/include/linux/iommufd.h b/include/linux/iommufd.h > index 0287a6d00192..dc174d02132b 100644 > --- a/include/linux/iommufd.h > +++ b/include/linux/iommufd.h > @@ -184,6 +184,8 @@ static inline int iommufd_vfio_compat_set_no_iommu(struct > iommufd_ctx *ictx) > struct iommufd_object *_iommufd_object_alloc(struct iommufd_ctx *ictx, > size_t size, > enum iommufd_object_type type); > +struct device *iommufd_viommu_find_dev(struct iommufd_viommu *viommu, > + unsigned long vdev_id); > #else /* !CONFIG_IOMMUFD_DRIVER */ > static inline struct iommufd_object * > _iommufd_object_alloc(struct iommufd_ctx *ictx, size_t size, > @@ -191,6 +193,12 @@ _iommufd_object_alloc(struct iommufd_ctx *ictx, size_t > size, > { > return ERR_PTR(-EOPNOTSUPP); > } > + > +static inline struct device * > +iommufd_viommu_find_dev(struct iommufd_viommu *viommu, unsigned long vdev_id) > +{ > + return NULL; > +} > #endif /* CONFIG_IOMMUFD_DRIVER */ > > /* > diff --git a/drivers/iommu/iommufd/driver.c b/drivers/iommu/iommufd/driver.c > index c0876d3f91c7..3604443f82a1 100644 > --- a/drivers/iommu/iommufd/driver.c > +++ b/drivers/iommu/iommufd/driver.c > @@ -36,3 +36,16 @@ struct iommufd_object *_iommufd_object_alloc(struct > iommufd_ctx *ictx, > return ERR_PTR(rc); > } > EXPORT_SYMBOL_NS_GPL(_iommufd_object_alloc, IOMMUFD); > + > +/* Caller should xa_lock(&viommu->vdevs) to protect the return value */ > +struct device *iommufd_viommu_find_dev(struct iommufd_viommu *viommu, > + unsigned long vdev_id) > +{ > + struct iommufd_vdevice *vdev; > + > + lockdep_is_held(&viommu->vdevs.xa_lock); > + > + vdev = xa_load(&viommu->vdevs, vdev_id); > + return vdev ? vdev->idev->dev : NULL; > +} Got this error? ld: Unexpected GOT/PLT entries detected! ld: Unexpected run-time procedure linkages detected! ld: drivers/iommu/iommufd/driver.o: in function `iommufd_viommu_find_dev': linux/drivers/iommu/iommufd/driver.c:47: undefined reference to `lockdep_is_held' make[2]: *** [scripts/Makefile.vmlinux:34: vmlinux] Error 1 make[1]: *** [/home/linaro/iommufd/linux/Makefile:1166: vmlinux] Error 2 make: *** [Makefile:224: __sub-make] Error 2
Re: [PATCH v3 03/11] iommufd: Introduce IOMMUFD_OBJ_VIOMMU and its related struct
On Wed, 16 Oct 2024 at 14:52, Nicolin Chen wrote: > > On Wed, Oct 16, 2024 at 09:56:51AM +0800, Zhangfei Gao wrote: > > On Wed, 16 Oct 2024 at 02:44, Nicolin Chen wrote: > > > > > > On Mon, Oct 14, 2024 at 07:01:40PM -0700, Nicolin Chen wrote: > > > > On Tue, Oct 15, 2024 at 09:15:01AM +0800, Zhangfei Gao wrote: > > > > > > > > > > > iommufd_device_bind > > > > > > > iommufd_device_attach > > > > > > > iommufd_vdevice_alloc_ioctl > > > > > > > > > > > > > > iommufd_device_detach > > > > > > > iommufd_device_unbind // refcount check fail > > > > > > > iommufd_vdevice_destroy ref-- > > > > > > > > > > > > Things should be symmetric. As you suspected, vdevice should be > > > > > > destroyed before iommufd_device_detach. > > > > > > > > > > I am trying based on your for_iommufd_viommu_p2-v3 branch, do you have > > > > > this issue? > > > > > In checking whether close fd before unbind? > > > > > > > > Oops, my bad. I will provide a fix. > > > > > > This should fix the problem: > > > > > > - > > > diff --git a/drivers/iommu/iommufd/device.c > > > b/drivers/iommu/iommufd/device.c > > > index 5fd3dd420290..13100cfea29d 100644 > > > --- a/drivers/iommu/iommufd/device.c > > > +++ b/drivers/iommu/iommufd/device.c > > > @@ -277,6 +277,11 @@ EXPORT_SYMBOL_NS_GPL(iommufd_ctx_has_group, IOMMUFD); > > > */ > > > void iommufd_device_unbind(struct iommufd_device *idev) > > > { > > > + mutex_lock(&idev->igroup->lock); > > > + /* idev->vdev object should be destroyed prior, yet just in > > > case.. */ > > > + if (idev->vdev) > > > + iommufd_object_remove(idev->ictx, NULL, > > > idev->vdev->obj.id, 0); > > > + mutex_unlock(&idev->igroup->lock); > > > iommufd_object_destroy_user(idev->ictx, &idev->obj); > > > } > > > EXPORT_SYMBOL_NS_GPL(iommufd_device_unbind, IOMMUFD); > > > - > > > > Not yet > > [ 574.162112] Unable to handle kernel NULL pointer dereference at > > virtual address 0004 > > [ 574.261102] pc : iommufd_object_remove+0x7c/0x278 > > [ 574.265795] lr : iommufd_device_unbind+0x44/0x98 > > in check > > Hmm, it's kinda odd it crashes inside iommufd_object_remove(). > Did you happen to change something there? > > The added iommufd_object_remove() is equivalent to userspace > calling the destroy ioctl on the vDEVICE object. > Yes, double confirmed, it can solve the issue. The guest can stop and run again The Null pointer may be caused by the added debug. Thanks Nico. > Nicolin
Re: [PATCH v3 03/11] iommufd: Introduce IOMMUFD_OBJ_VIOMMU and its related struct
On Wed, 16 Oct 2024 at 02:44, Nicolin Chen wrote: > > On Mon, Oct 14, 2024 at 07:01:40PM -0700, Nicolin Chen wrote: > > On Tue, Oct 15, 2024 at 09:15:01AM +0800, Zhangfei Gao wrote: > > > > > > > iommufd_device_bind > > > > > iommufd_device_attach > > > > > iommufd_vdevice_alloc_ioctl > > > > > > > > > > iommufd_device_detach > > > > > iommufd_device_unbind // refcount check fail > > > > > iommufd_vdevice_destroy ref-- > > > > > > > > Things should be symmetric. As you suspected, vdevice should be > > > > destroyed before iommufd_device_detach. > > > > > > I am trying based on your for_iommufd_viommu_p2-v3 branch, do you have > > > this issue? > > > In checking whether close fd before unbind? > > > > Oops, my bad. I will provide a fix. > > This should fix the problem: > > - > diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c > index 5fd3dd420290..13100cfea29d 100644 > --- a/drivers/iommu/iommufd/device.c > +++ b/drivers/iommu/iommufd/device.c > @@ -277,6 +277,11 @@ EXPORT_SYMBOL_NS_GPL(iommufd_ctx_has_group, IOMMUFD); > */ > void iommufd_device_unbind(struct iommufd_device *idev) > { > + mutex_lock(&idev->igroup->lock); > + /* idev->vdev object should be destroyed prior, yet just in case.. */ > + if (idev->vdev) > + iommufd_object_remove(idev->ictx, NULL, idev->vdev->obj.id, > 0); > + mutex_unlock(&idev->igroup->lock); > iommufd_object_destroy_user(idev->ictx, &idev->obj); > } > EXPORT_SYMBOL_NS_GPL(iommufd_device_unbind, IOMMUFD); > - Not yet [ 574.162112] Unable to handle kernel NULL pointer dereference at virtual address 0004 [ 574.261102] pc : iommufd_object_remove+0x7c/0x278 [ 574.265795] lr : iommufd_device_unbind+0x44/0x98 in check Thanks > > Thanks > Nicolin
Re: [PATCH v3 03/11] iommufd: Introduce IOMMUFD_OBJ_VIOMMU and its related struct
On Mon, 14 Oct 2024 at 23:46, Nicolin Chen wrote: > > On Mon, Oct 14, 2024 at 03:58:55PM +0800, Zhangfei Gao wrote: > > > > > > > +struct iommufd_object *iommufd_object_alloc_elm(struct iommufd_ctx > > > > > > *ictx, > > > > > > + size_t size, > > > > > > + enum > > > > > > iommufd_object_type type) > > > > > > +{ > > > > > > + struct iommufd_object *obj; > > > > > > + int rc; > > > > > > + > > > > > > + obj = kzalloc(size, GFP_KERNEL_ACCOUNT); > > > > > > + if (!obj) > > > > > > + return ERR_PTR(-ENOMEM); > > > > > > + obj->type = type; > > > > > > + /* Starts out bias'd by 1 until it is removed from the > > > > > > xarray */ > > > > > > + refcount_set(&obj->shortterm_users, 1); > > > > > > + refcount_set(&obj->users, 1); > > > > > > > > > > here set refcont 1 > > > > > > > > > > iommufd_device_bind -> iommufd_object_alloc(ictx, idev, > > > > > IOMMUFD_OBJ_DEVICE): refcont -> 1 > > > > > refcount_inc(&idev->obj.users); refcount -> 2 > > > > > will cause iommufd_device_unbind fail. > > > > > > > > > > May remove refcount_inc(&idev->obj.users) in iommufd_device_bind > > > > > > > > Hmm, why would it fail? Or is it failing on your system? > > > > > > Not sure, still in check, it may only be on my platform. > > > > > > it hit > > > iommufd_object_remove: > > > if (WARN_ON(obj != to_destroy)) > > > > > > iommufd_device_bind refcount=2 > > > iommufd_device_attach refcount=3 > > > //still not sure which operation inc the count? > > > iommufd_device_detach refcount=4 > > > > > > > Have a question, > > when should iommufd_vdevice_destroy be called, before or after > > iommufd_device_unbind. > > Before. > > > Now iommufd_vdevice_destroy (ref--) is after unbind, hits the if > > (!refcount_dec_if_one(&obj->users)) check. > > Hmm, where do we have an iommufd_vdevice_destroy after unbind? Looks like it is called by close fd? [ 2480.909319] iommufd_vdevice_destroy+0xdc/0x168 [ 2480.909324] iommufd_fops_release+0xa4/0x140 [ 2480.909328] __fput+0xd0/0x2e0 [ 2480.909331] fput+0x1c/0x30 [ 2480.909333] task_work_run+0x78/0xe0 [ 2480.909337] do_exit+0x2fc/0xa50 [ 2480.909340] do_group_exit+0x3c/0xa0 [ 2480.909344] get_signal+0x96c/0x978 [ 2480.909346] do_signal+0x94/0x3a8 [ 2480.909348] do_notify_resume+0x100/0x190 > > > iommufd_device_bind > > iommufd_device_attach > > iommufd_vdevice_alloc_ioctl > > > > iommufd_device_detach > > iommufd_device_unbind // refcount check fail > > iommufd_vdevice_destroy ref-- > > Things should be symmetric. As you suspected, vdevice should be > destroyed before iommufd_device_detach. I am trying based on your for_iommufd_viommu_p2-v3 branch, do you have this issue? In checking whether close fd before unbind? > > A vdev is an object on top of a vIOMMU obj and an idev obj, so > it takes a refcount from each of them. That's why idev couldn't > unbind. Thanks > > Thanks > Nicolin
Re: [PATCH v3 03/11] iommufd: Introduce IOMMUFD_OBJ_VIOMMU and its related struct
Hi, Nico On Sat, 12 Oct 2024 at 18:18, Zhangfei Gao wrote: > > On Sat, 12 Oct 2024 at 12:49, Nicolin Chen wrote: > > > > On Sat, Oct 12, 2024 at 11:23:07AM +0800, Zhangfei Gao wrote: > > > > > > diff --git a/drivers/iommu/iommufd/viommu_api.c > > > > b/drivers/iommu/iommufd/viommu_api.c > > > > new file mode 100644 > > > > index ..c1731f080d6b > > > > --- /dev/null > > > > +++ b/drivers/iommu/iommufd/viommu_api.c > > > > @@ -0,0 +1,57 @@ > > > > +// SPDX-License-Identifier: GPL-2.0-only > > > > +/* Copyright (c) 2024, NVIDIA CORPORATION & AFFILIATES > > > > + */ > > > > + > > > > +#include "iommufd_private.h" > > > > + > > > > +struct iommufd_object *iommufd_object_alloc_elm(struct iommufd_ctx > > > > *ictx, > > > > + size_t size, > > > > + enum > > > > iommufd_object_type type) > > > > +{ > > > > + struct iommufd_object *obj; > > > > + int rc; > > > > + > > > > + obj = kzalloc(size, GFP_KERNEL_ACCOUNT); > > > > + if (!obj) > > > > + return ERR_PTR(-ENOMEM); > > > > + obj->type = type; > > > > + /* Starts out bias'd by 1 until it is removed from the xarray */ > > > > + refcount_set(&obj->shortterm_users, 1); > > > > + refcount_set(&obj->users, 1); > > > > > > here set refcont 1 > > > > > > iommufd_device_bind -> iommufd_object_alloc(ictx, idev, > > > IOMMUFD_OBJ_DEVICE): refcont -> 1 > > > refcount_inc(&idev->obj.users); refcount -> 2 > > > will cause iommufd_device_unbind fail. > > > > > > May remove refcount_inc(&idev->obj.users) in iommufd_device_bind > > > > Hmm, why would it fail? Or is it failing on your system? > > Not sure, still in check, it may only be on my platform. > > it hit > iommufd_object_remove: > if (WARN_ON(obj != to_destroy)) > > iommufd_device_bind refcount=2 > iommufd_device_attach refcount=3 > //still not sure which operation inc the count? > iommufd_device_detach refcount=4 > Have a question, when should iommufd_vdevice_destroy be called, before or after iommufd_device_unbind. Now iommufd_vdevice_destroy (ref--) is after unbind, hits the if (!refcount_dec_if_one(&obj->users)) check. iommufd_device_bind iommufd_device_attach iommufd_vdevice_alloc_ioctl iommufd_device_detach iommufd_device_unbind // refcount check fail iommufd_vdevice_destroy ref-- Thanks
Re: [PATCH v3 03/11] iommufd: Introduce IOMMUFD_OBJ_VIOMMU and its related struct
On Sat, 12 Oct 2024 at 12:49, Nicolin Chen wrote: > > On Sat, Oct 12, 2024 at 11:23:07AM +0800, Zhangfei Gao wrote: > > > > diff --git a/drivers/iommu/iommufd/viommu_api.c > > > b/drivers/iommu/iommufd/viommu_api.c > > > new file mode 100644 > > > index ..c1731f080d6b > > > --- /dev/null > > > +++ b/drivers/iommu/iommufd/viommu_api.c > > > @@ -0,0 +1,57 @@ > > > +// SPDX-License-Identifier: GPL-2.0-only > > > +/* Copyright (c) 2024, NVIDIA CORPORATION & AFFILIATES > > > + */ > > > + > > > +#include "iommufd_private.h" > > > + > > > +struct iommufd_object *iommufd_object_alloc_elm(struct iommufd_ctx *ictx, > > > + size_t size, > > > + enum iommufd_object_type > > > type) > > > +{ > > > + struct iommufd_object *obj; > > > + int rc; > > > + > > > + obj = kzalloc(size, GFP_KERNEL_ACCOUNT); > > > + if (!obj) > > > + return ERR_PTR(-ENOMEM); > > > + obj->type = type; > > > + /* Starts out bias'd by 1 until it is removed from the xarray */ > > > + refcount_set(&obj->shortterm_users, 1); > > > + refcount_set(&obj->users, 1); > > > > here set refcont 1 > > > > iommufd_device_bind -> iommufd_object_alloc(ictx, idev, > > IOMMUFD_OBJ_DEVICE): refcont -> 1 > > refcount_inc(&idev->obj.users); refcount -> 2 > > will cause iommufd_device_unbind fail. > > > > May remove refcount_inc(&idev->obj.users) in iommufd_device_bind > > Hmm, why would it fail? Or is it failing on your system? Not sure, still in check, it may only be on my platform. it hit iommufd_object_remove: if (WARN_ON(obj != to_destroy)) iommufd_device_bind refcount=2 iommufd_device_attach refcount=3 //still not sure which operation inc the count? iommufd_device_detach refcount=4 Thanks > > This patch doesn't change the function but only moved it.. > > Thanks > Nicolin
Re: [PATCH v3 03/11] iommufd: Introduce IOMMUFD_OBJ_VIOMMU and its related struct
On Thu, 10 Oct 2024 at 00:40, Nicolin Chen wrote: > > Add a new IOMMUFD_OBJ_VIOMMU with an iommufd_viommu structure to represent > a slice of physical IOMMU device passed to or shared with a user space VM. > This slice, now a vIOMMU object, is a group of virtualization resources of > a physical IOMMU's, such as: > - Security namespace for guest owned ID, e.g. guest-controlled cache tags > - Access to a sharable nesting parent pagetable across physical IOMMUs > - Virtualization of various platforms IDs, e.g. RIDs and others > - Delivery of paravirtualized invalidation > - Direct assigned invalidation queues > - Direct assigned interrupts > - Non-affiliated event reporting > > Add a new viommu_alloc op in iommu_ops, for drivers to allocate their own > vIOMMU structures. And this allocation also needs a free(), so add struct > iommufd_viommu_ops. > > To simplify a vIOMMU allocation, provide a iommufd_viommu_alloc() helper. > It's suggested that a driver should embed a core-level viommu structure in > its driver-level viommu struct and call the iommufd_viommu_alloc() helper, > meanwhile the driver can also implement a viommu ops: > struct my_driver_viommu { > struct iommufd_viommu core; > /* driver-owned properties/features */ > > }; > > static const struct iommufd_viommu_ops my_driver_viommu_ops = { > .free = my_driver_viommu_free, > /* future ops for virtualization features */ > > }; > > static struct iommufd_viommu my_driver_viommu_alloc(...) > { > struct my_driver_viommu *my_viommu = > iommufd_viommu_alloc(ictx, my_driver_viommu, core, > my_driver_viommu_ops); > /* Init my_viommu and related HW feature */ > > return &my_viommu->core; > } > > static struct iommu_domain_ops my_driver_domain_ops = { > > .viommu_alloc = my_driver_viommu_alloc, > }; > > To make the Kernel config work between a driver and the iommufd core, put > the for-driver allocation helpers into a new viommu_api file building with > CONFIG_IOMMUFD_DRIVER. > > Signed-off-by: Nicolin Chen > diff --git a/drivers/iommu/iommufd/viommu_api.c > b/drivers/iommu/iommufd/viommu_api.c > new file mode 100644 > index ..c1731f080d6b > --- /dev/null > +++ b/drivers/iommu/iommufd/viommu_api.c > @@ -0,0 +1,57 @@ > +// SPDX-License-Identifier: GPL-2.0-only > +/* Copyright (c) 2024, NVIDIA CORPORATION & AFFILIATES > + */ > + > +#include "iommufd_private.h" > + > +struct iommufd_object *iommufd_object_alloc_elm(struct iommufd_ctx *ictx, > + size_t size, > + enum iommufd_object_type type) > +{ > + struct iommufd_object *obj; > + int rc; > + > + obj = kzalloc(size, GFP_KERNEL_ACCOUNT); > + if (!obj) > + return ERR_PTR(-ENOMEM); > + obj->type = type; > + /* Starts out bias'd by 1 until it is removed from the xarray */ > + refcount_set(&obj->shortterm_users, 1); > + refcount_set(&obj->users, 1); here set refcont 1 iommufd_device_bind -> iommufd_object_alloc(ictx, idev, IOMMUFD_OBJ_DEVICE): refcont -> 1 refcount_inc(&idev->obj.users); refcount -> 2 will cause iommufd_device_unbind fail. May remove refcount_inc(&idev->obj.users) in iommufd_device_bind Thanks
Re: [PATCH v3 14/16] iommu/arm-smmu-v3: Add arm_vsmmu_cache_invalidate
On Thu, 10 Oct 2024 at 00:44, Nicolin Chen wrote: > > Implement the vIOMMU's cache_invalidate op for user space to invalidate the > IOTLB entries, Device ATS and CD entries that are still cached by hardware. > > Add struct iommu_viommu_arm_smmuv3_invalidate defining invalidation entries > that are simply in the native format of a 128-bit TLBI command. Scan those > commands against the permitted command list and fix their VMID/SID fields. > > Co-developed-by: Eric Auger > Signed-off-by: Eric Auger > Co-developed-by: Jason Gunthorpe > Signed-off-by: Jason Gunthorpe > Signed-off-by: Nicolin Chen > --- > drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 5 + > include/uapi/linux/iommufd.h | 24 > .../arm/arm-smmu-v3/arm-smmu-v3-iommufd.c | 131 +- > drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 6 +- > 4 files changed, 162 insertions(+), 4 deletions(-) > > +static int > +arm_vsmmu_convert_user_cmd(struct arm_vsmmu *vsmmu, > + struct iommu_viommu_arm_smmuv3_invalidate *cmd) > +{ > + cmd->cmd[0] = le64_to_cpu(cmd->cmd[0]); > + cmd->cmd[1] = le64_to_cpu(cmd->cmd[1]); > + > + switch (cmd->cmd[0] & CMDQ_0_OP) { > + case CMDQ_OP_TLBI_NSNH_ALL: > + /* Convert to NH_ALL */ > + cmd->cmd[0] = CMDQ_OP_TLBI_NH_ALL | > + FIELD_PREP(CMDQ_TLBI_0_VMID, vsmmu->vmid); > + cmd->cmd[1] = 0; > + break; > + case CMDQ_OP_TLBI_NH_VA: > + case CMDQ_OP_TLBI_NH_VAA: > + case CMDQ_OP_TLBI_NH_ALL: > + case CMDQ_OP_TLBI_NH_ASID: > + cmd->cmd[0] &= ~CMDQ_TLBI_0_VMID; > + cmd->cmd[0] |= FIELD_PREP(CMDQ_TLBI_0_VMID, vsmmu->vmid); > + break; > + case CMDQ_OP_ATC_INV: > + case CMDQ_OP_CFGI_CD: > + case CMDQ_OP_CFGI_CD_ALL: > + u32 sid, vsid = FIELD_GET(CMDQ_CFGI_0_SID, cmd->cmd[0]); Here got build error drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c:302:3: error: a label can only be part of a statement and a declaration is not a statement 302 | u32 sid, vsid = FIELD_GET(CMDQ_CFGI_0_SID, cmd->cmd[0]); | ^~~ Need {} to include. case CMDQ_OP_CFGI_CD_ALL: { ... } Thanks
Re: [PATCH v2] dmaengine: k3dma: use the correct HiSilicon copyright
On 2021/4/1 下午7:50, Hao Fang wrote: s/Hisilicon/HiSilicon/g. It should use capital S, according to the official website. Signed-off-by: Hao Fang Thanks for the patch. Acked-by: Zhangfei Gao --- V2: -remove the terms of use link. --- drivers/dma/k3dma.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/dma/k3dma.c b/drivers/dma/k3dma.c index d0b2e60..ecdaada9 100644 --- a/drivers/dma/k3dma.c +++ b/drivers/dma/k3dma.c @@ -1,7 +1,7 @@ // SPDX-License-Identifier: GPL-2.0-only /* * Copyright (c) 2013 - 2015 Linaro Ltd. - * Copyright (c) 2013 Hisilicon Limited. + * Copyright (c) 2013 HiSilicon Limited. */ #include #include @@ -1039,6 +1039,6 @@ static struct platform_driver k3_pdma_driver = { module_platform_driver(k3_pdma_driver); -MODULE_DESCRIPTION("Hisilicon k3 DMA Driver"); +MODULE_DESCRIPTION("HiSilicon k3 DMA Driver"); MODULE_ALIAS("platform:k3dma"); MODULE_LICENSE("GPL v2");
Re: [PATCH] mmc: dw_mmc-k3: use the correct HiSilicon copyright
On Tue, Mar 30, 2021 at 2:46 PM Hao Fang wrote: > > s/Hisilicon/HiSilicon/g. > It should use capital S, according to > https://www.hisilicon.com/en/terms-of-use. > > Signed-off-by: Hao Fang Thanks for the fix. Acked-by: Zhangfei Gao > --- > drivers/mmc/host/dw_mmc-k3.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/mmc/host/dw_mmc-k3.c b/drivers/mmc/host/dw_mmc-k3.c > index 29d2494..0311a37 100644 > --- a/drivers/mmc/host/dw_mmc-k3.c > +++ b/drivers/mmc/host/dw_mmc-k3.c > @@ -1,7 +1,7 @@ > // SPDX-License-Identifier: GPL-2.0-or-later > /* > * Copyright (c) 2013 Linaro Ltd. > - * Copyright (c) 2013 Hisilicon Limited. > + * Copyright (c) 2013 HiSilicon Limited. > */ > > #include > -- > 2.8.1 >
Re: [PATCH v2 3/3] PCI: set dma-can-stall for HiSilicon chip
Hi, Krzysztof On 2021/3/8 上午1:54, Krzysztof Wilczyński wrote: Hi, [...] Property dma-can-stall depends on patchset https://lore.kernel.org/linux-iommu/20210108145217.2254447-1-jean-phili...@linaro.org/ [...] If you plan to post another version of this patch to include the above link into the commit message or reference to the commit itself, as Jean-Philippe's series can already be included in the mainline (since it has been a while now from when this series was originally posted), then I have a favour to ask - would you also be able to also capitalise the subject line (so that it's consistent) and change "chip" to "chips" since there are two you mention in the commit message. Have sent another version with the changes, thanks for the remind. I was waiting for Jean's patchest, https://lore.kernel.org/linux-iommu/20210302092644.2553014-1-jean-phili...@linaro.org/ Though the quirks patches can be applied and build directly on 5.12-rc1. Thanks
[PATCH v3 3/3] PCI: Set dma-can-stall for HiSilicon chips
HiSilicon KunPeng920 and KunPeng930 have devices appear as PCI but are actually on the AMBA bus. These fake PCI devices can support SVA via SMMU stall feature, by setting dma-can-stall for ACPI platforms. Property dma-can-stall depends on patchset https://lore.kernel.org/linux-iommu/20210302092644.2553014-1-jean-phili...@linaro.org/ Signed-off-by: Zhangfei Gao Signed-off-by: Jean-Philippe Brucker Signed-off-by: Zhou Wang --- drivers/pci/quirks.c | 13 + 1 file changed, 13 insertions(+) diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index 873d27f..b866cdf 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -1827,10 +1827,23 @@ DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_HUAWEI, 0x1610, PCI_CLASS_BRIDGE_PCI static void quirk_huawei_pcie_sva(struct pci_dev *pdev) { + struct property_entry properties[] = { + PROPERTY_ENTRY_BOOL("dma-can-stall"), + {}, + }; + if (pdev->revision != 0x21 && pdev->revision != 0x30) return; pdev->pasid_no_tlp = 1; + + /* +* Set the dma-can-stall property on ACPI platforms. Device tree +* can set it directly. +*/ + if (!pdev->dev.of_node && + device_add_properties(&pdev->dev, properties)) + pci_warn(pdev, "could not add stall property"); } DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa250, quirk_huawei_pcie_sva); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa251, quirk_huawei_pcie_sva); -- 2.9.5
[PATCH v3 2/3] PCI: Add a quirk to set pasid_no_tlp for HiSilicon chips
HiSilicon KunPeng920 and KunPeng930 have devices appear as PCI but are actually on the AMBA bus. These fake PCI devices have PASID capability though not supporting TLP. Add a quirk to set pasid_no_tlp for these devices. Signed-off-by: Zhangfei Gao Signed-off-by: Jean-Philippe Brucker Signed-off-by: Zhou Wang --- drivers/pci/quirks.c | 14 ++ 1 file changed, 14 insertions(+) diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index 653660e..873d27f 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -1825,6 +1825,20 @@ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_E7525_MCH, quir DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_HUAWEI, 0x1610, PCI_CLASS_BRIDGE_PCI, 8, quirk_pcie_mch); +static void quirk_huawei_pcie_sva(struct pci_dev *pdev) +{ + if (pdev->revision != 0x21 && pdev->revision != 0x30) + return; + + pdev->pasid_no_tlp = 1; +} +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa250, quirk_huawei_pcie_sva); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa251, quirk_huawei_pcie_sva); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa255, quirk_huawei_pcie_sva); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa256, quirk_huawei_pcie_sva); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa258, quirk_huawei_pcie_sva); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa259, quirk_huawei_pcie_sva); + /* * It's possible for the MSI to get corrupted if SHPC and ACPI are used * together on certain PXH-based systems. -- 2.9.5
[PATCH v3 1/3] PCI: PASID can be enabled without TLP prefix
A PASID-like feature is implemented on AMBA without using TLP prefixes and these devices have PASID capability though not supporting TLP. Adding a pasid_no_tlp bit for "PASID works without TLP prefixes" and pci_enable_pasid() checks pasid_no_tlp as well as eetlp_prefix_path. Suggested-by: Bjorn Helgaas Signed-off-by: Zhangfei Gao --- drivers/pci/ats.c | 2 +- include/linux/pci.h | 1 + 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c index 793d381..88f981b 100644 --- a/drivers/pci/ats.c +++ b/drivers/pci/ats.c @@ -380,7 +380,7 @@ int pci_enable_pasid(struct pci_dev *pdev, int features) if (WARN_ON(pdev->pasid_enabled)) return -EBUSY; - if (!pdev->eetlp_prefix_path) + if (!pdev->eetlp_prefix_path && !pdev->pasid_no_tlp) return -EINVAL; if (!pasid) diff --git a/include/linux/pci.h b/include/linux/pci.h index 86c799c..1daa943 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -388,6 +388,7 @@ struct pci_dev { supported from root to here */ u16 l1ss; /* L1SS Capability pointer */ #endif + unsigned intpasid_no_tlp:1; /* PASID works without TLP Prefix */ unsigned inteetlp_prefix_path:1;/* End-to-End TLP Prefix */ pci_channel_state_t error_state;/* Current connectivity state */ -- 2.9.5
[PATCH v3 0/3] PCI: Add a quirk to enable SVA for HiSilicon chip
HiSilicon KunPeng920 and KunPeng930 have devices appear as PCI but are actually on the AMBA bus. These fake PCI devices have PASID capability though not supporting TLP. Add a quirk to set pasid_no_tlp and dma-can-stall for these devices. v3: Rebase to Linux 5.12-rc1 Change commit msg adding: Property dma-can-stall depends on patchset https://lore.kernel.org/linux-iommu/20210302092644.2553014-1-jean-phili...@linaro.org/ By the way the patchset can directly applied on 5.12-rc1 and build successfully though without the dependent patchset. v2: Add a new pci_dev bit: pasid_no_tlp, suggested by Bjorn "Apparently these devices have a PASID capability. I think you should add a new pci_dev bit that is specific to this idea of "PASID works without TLP prefixes" and then change pci_enable_pasid() to look at that bit as well as eetlp_prefix_path." https://lore.kernel.org/linux-pci/20210112170230.GA1838341@bjorn-Precision-5520/ Zhangfei Gao (3): PCI: PASID can be enabled without TLP prefix PCI: Add a quirk to set pasid_no_tlp for HiSilicon chips PCI: Set dma-can-stall for HiSilicon chips drivers/pci/ats.c| 2 +- drivers/pci/quirks.c | 27 +++ include/linux/pci.h | 1 + 3 files changed, 29 insertions(+), 1 deletion(-) -- 2.9.5
[PATCH v2 2/3] PCI: Add a quirk to set pasid_no_tlp for HiSilicon chip
HiSilicon KunPeng920 and KunPeng930 have devices appear as PCI but are actually on the AMBA bus. These fake PCI devices have PASID capability though not supporting TLP. Add a quirk to set pasid_no_tlp for these devices. Signed-off-by: Zhangfei Gao Signed-off-by: Jean-Philippe Brucker Signed-off-by: Zhou Wang --- drivers/pci/quirks.c | 14 ++ 1 file changed, 14 insertions(+) diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index 653660e..873d27f 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -1825,6 +1825,20 @@ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_E7525_MCH, quir DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_HUAWEI, 0x1610, PCI_CLASS_BRIDGE_PCI, 8, quirk_pcie_mch); +static void quirk_huawei_pcie_sva(struct pci_dev *pdev) +{ + if (pdev->revision != 0x21 && pdev->revision != 0x30) + return; + + pdev->pasid_no_tlp = 1; +} +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa250, quirk_huawei_pcie_sva); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa251, quirk_huawei_pcie_sva); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa255, quirk_huawei_pcie_sva); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa256, quirk_huawei_pcie_sva); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa258, quirk_huawei_pcie_sva); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa259, quirk_huawei_pcie_sva); + /* * It's possible for the MSI to get corrupted if SHPC and ACPI are used * together on certain PXH-based systems. -- 2.7.4
[PATCH v2 3/3] PCI: set dma-can-stall for HiSilicon chip
HiSilicon KunPeng920 and KunPeng930 have devices appear as PCI but are actually on the AMBA bus. These fake PCI devices can support SVA via SMMU stall feature, by setting dma-can-stall for ACPI platforms. Signed-off-by: Zhangfei Gao Signed-off-by: Jean-Philippe Brucker Signed-off-by: Zhou Wang --- Property dma-can-stall depends on patchset https://lore.kernel.org/linux-iommu/20210108145217.2254447-1-jean-phili...@linaro.org/ drivers/pci/quirks.c | 13 + 1 file changed, 13 insertions(+) diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index 873d27f..b866cdf 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -1827,10 +1827,23 @@ DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_HUAWEI, 0x1610, PCI_CLASS_BRIDGE_PCI static void quirk_huawei_pcie_sva(struct pci_dev *pdev) { + struct property_entry properties[] = { + PROPERTY_ENTRY_BOOL("dma-can-stall"), + {}, + }; + if (pdev->revision != 0x21 && pdev->revision != 0x30) return; pdev->pasid_no_tlp = 1; + + /* +* Set the dma-can-stall property on ACPI platforms. Device tree +* can set it directly. +*/ + if (!pdev->dev.of_node && + device_add_properties(&pdev->dev, properties)) + pci_warn(pdev, "could not add stall property"); } DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa250, quirk_huawei_pcie_sva); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa251, quirk_huawei_pcie_sva); -- 2.7.4
[PATCH v2 1/3] PCI: PASID can be enabled without TLP prefix
A PASID-like feature is implemented on AMBA without using TLP prefixes and these devices have PASID capability though not supporting TLP. Adding a pasid_no_tlp bit for "PASID works without TLP prefixes" and pci_enable_pasid() checks pasid_no_tlp as well as eetlp_prefix_path. Suggested-by: Bjorn Helgaas Signed-off-by: Zhangfei Gao --- drivers/pci/ats.c | 2 +- include/linux/pci.h | 1 + 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c index e36d601..b67b1b1 100644 --- a/drivers/pci/ats.c +++ b/drivers/pci/ats.c @@ -386,7 +386,7 @@ int pci_enable_pasid(struct pci_dev *pdev, int features) if (WARN_ON(pdev->pasid_enabled)) return -EBUSY; - if (!pdev->eetlp_prefix_path) + if (!pdev->eetlp_prefix_path && !pdev->pasid_no_tlp) return -EINVAL; if (!pasid) diff --git a/include/linux/pci.h b/include/linux/pci.h index f1f26f8..ac1c735 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -388,6 +388,7 @@ struct pci_dev { supported from root to here */ u16 l1ss; /* L1SS Capability pointer */ #endif + unsigned intpasid_no_tlp:1; /* PASID works without TLP Prefix */ unsigned inteetlp_prefix_path:1;/* End-to-End TLP Prefix */ pci_channel_state_t error_state;/* Current connectivity state */ -- 2.7.4
[PATCH v2 0/3] PCI: Add a quirk to enable SVA for HiSilicon chip
HiSilicon KunPeng920 and KunPeng930 have devices appear as PCI but are actually on the AMBA bus. These fake PCI devices have PASID capability though not supporting TLP. Add a quirk to set pasid_no_tlp and dma-can-stall for these devices. v2: Add a new pci_dev bit: pasid_no_tlp, suggested by Bjorn "Apparently these devices have a PASID capability. I think you should add a new pci_dev bit that is specific to this idea of "PASID works without TLP prefixes" and then change pci_enable_pasid() to look at that bit as well as eetlp_prefix_path." https://lore.kernel.org/linux-pci/20210112170230.GA1838341@bjorn-Precision-5520/ Zhangfei Gao (3): PCI: PASID can be enabled without TLP prefix PCI: Add a quirk to set pasid_no_tlp for HiSilicon chip PCI: set dma-can-stall for HiSilicon chip drivers/pci/ats.c| 2 +- drivers/pci/quirks.c | 27 +++ include/linux/pci.h | 1 + 3 files changed, 29 insertions(+), 1 deletion(-) -- 2.7.4
Re: [PATCH] PCI: Add a quirk to enable SVA for HiSilicon chip
Hi, Bjorn Thanks for the suggestion. On 2021/1/13 上午1:02, Bjorn Helgaas wrote: On Tue, Jan 12, 2021 at 02:49:52PM +0800, Zhangfei Gao wrote: HiSilicon KunPeng920 and KunPeng930 have devices appear as PCI but are actually on the AMBA bus. These fake PCI devices can not support tlp and have to enable SMMU stall mode to use the SVA feature. Add a quirk to set dma-can-stall property and enable tlp for these devices. s/tlp/TLP/ I don't think "enable TLP" really captures what's going on here. You must be referring to the fact that you set pdev->eetlp_prefix_path. That is normally set by pci_configure_eetlp_prefix() if the Device Capabilities 2 register has the End-End TLP Prefix Supported bit set *and* all devices in the upstream path also have it set. The only place we currently test eetlp_prefix_path is in pci_enable_pasid(). In PCIe, PASID is implemented using the PASID TLP prefix, so we only enable PASID if TLP prefixes are supported. If I understand correctly, a PASID-like feature is implemented on AMBA without using TLP prefixes, and setting eetlp_prefix_path makes that work. Yes, that's the requirement. I don't think you should do this by setting eetlp_prefix_path because TLP prefixes are used for other features, e.g., TPH. Setting eetlp_prefix_path implies these devices can also support things like TLP, and I don't think that's necessarily true. Thanks for the remainder. Apparently these devices have a PASID capability. I think you should add a new pci_dev bit that is specific to this idea of "PASID works without TLP prefixes" and then change pci_enable_pasid() to look at that bit as well as eetlp_prefix_path. That's great, this solution is much simpler. we can set the bit before pci_enable_pasid. It seems like dma-can-stall is a separate thing from PASID? If so, this should be two separate patches. If they can be separated, I would probably make the PASID thing the first patch, and then the "dma-can-stall" can be on its own as a broken DT workaround (if that's what it is) and it's easier to remove that if it become obsolete. Signed-off-by: Zhangfei Gao Signed-off-by: Jean-Philippe Brucker Signed-off-by: Zhou Wang --- Property dma-can-stall depends on patchset https://lore.kernel.org/linux-iommu/20210108145217.2254447-1-jean-phili...@linaro.org/ drivers/pci/quirks.c | 25 + 1 file changed, 25 insertions(+) diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index 653660e..a27f327 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -1825,6 +1825,31 @@ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_E7525_MCH, quir DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_HUAWEI, 0x1610, PCI_CLASS_BRIDGE_PCI, 8, quirk_pcie_mch); +static void quirk_huawei_pcie_sva(struct pci_dev *pdev) +{ + struct property_entry properties[] = { + PROPERTY_ENTRY_BOOL("dma-can-stall"), + {}, + }; + + if ((pdev->revision != 0x21) && (pdev->revision != 0x30)) + return; + + pdev->eetlp_prefix_path = 1; + + /* Device-tree can set the stall property */ + if (!pdev->dev.of_node && + device_add_properties(&pdev->dev, properties)) Does this mean "dma-can-stall" *can* be set via DT, and if it is, this quirk is not needed? So is this quirk basically a workaround for an old or broken DT? The quirk is still needed for uefi case, since uefi can not describe the endpoints (peripheral devices). + pci_warn(pdev, "could not add stall property"); +} + Remove this blank line to follow the style of the rest of the file. +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa250, quirk_huawei_pcie_sva); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa251, quirk_huawei_pcie_sva); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa255, quirk_huawei_pcie_sva); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa256, quirk_huawei_pcie_sva); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa258, quirk_huawei_pcie_sva); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa259, quirk_huawei_pcie_sva); + /* * It's possible for the MSI to get corrupted if SHPC and ACPI are used * together on certain PXH-based systems. How about changes like this diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 68f53f7..886ea26 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -2466,6 +2466,9 @@ static int arm_smmu_enable_pasid(struct arm_smmu_master *master) if (num_pasids <= 0) return num_pasids; + if (master->stall_enabled) + pdev->pasid_no_tlp = 1; + ret = pci_enable_pasid(pdev, features); if (ret) { dev_err(&pdev->dev, "Fail
[PATCH] PCI: Add a quirk to enable SVA for HiSilicon chip
HiSilicon KunPeng920 and KunPeng930 have devices appear as PCI but are actually on the AMBA bus. These fake PCI devices can not support tlp and have to enable SMMU stall mode to use the SVA feature. Add a quirk to set dma-can-stall property and enable tlp for these devices. Signed-off-by: Zhangfei Gao Signed-off-by: Jean-Philippe Brucker Signed-off-by: Zhou Wang --- Property dma-can-stall depends on patchset https://lore.kernel.org/linux-iommu/20210108145217.2254447-1-jean-phili...@linaro.org/ drivers/pci/quirks.c | 25 + 1 file changed, 25 insertions(+) diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index 653660e..a27f327 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -1825,6 +1825,31 @@ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_E7525_MCH, quir DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_HUAWEI, 0x1610, PCI_CLASS_BRIDGE_PCI, 8, quirk_pcie_mch); +static void quirk_huawei_pcie_sva(struct pci_dev *pdev) +{ + struct property_entry properties[] = { + PROPERTY_ENTRY_BOOL("dma-can-stall"), + {}, + }; + + if ((pdev->revision != 0x21) && (pdev->revision != 0x30)) + return; + + pdev->eetlp_prefix_path = 1; + + /* Device-tree can set the stall property */ + if (!pdev->dev.of_node && + device_add_properties(&pdev->dev, properties)) + pci_warn(pdev, "could not add stall property"); +} + +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa250, quirk_huawei_pcie_sva); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa251, quirk_huawei_pcie_sva); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa255, quirk_huawei_pcie_sva); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa256, quirk_huawei_pcie_sva); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa258, quirk_huawei_pcie_sva); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa259, quirk_huawei_pcie_sva); + /* * It's possible for the MSI to get corrupted if SHPC and ACPI are used * together on certain PXH-based systems. -- 2.7.4
Re: [PATCH] uacce: Use kobj_to_dev() instead of container_of()
On 2020/8/20 上午10:16, Tian Tao wrote: Use kobj_to_dev() instead of container_of() Signed-off-by: Tian Tao Reviewed-by: Zhou Wang Acked-by: Zhangfei Gao Thanks --- drivers/misc/uacce/uacce.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/misc/uacce/uacce.c b/drivers/misc/uacce/uacce.c index a5b8dab..a9da7b1 100644 --- a/drivers/misc/uacce/uacce.c +++ b/drivers/misc/uacce/uacce.c @@ -370,7 +370,7 @@ static struct attribute *uacce_dev_attrs[] = { static umode_t uacce_dev_is_visible(struct kobject *kobj, struct attribute *attr, int n) { - struct device *dev = container_of(kobj, struct device, kobj); + struct device *dev = kobj_to_dev(kobj); struct uacce_device *uacce = to_uacce_device(dev); if (((attr == &dev_attr_region_mmio_size.attr) &&
Re: [PATCH] uacce: fix some coding styles
On 2020/7/30 下午2:13, Kai Ye wrote: 1. delete some redundant code. 2. modify the module author information. Signed-off-by: Kai Ye Thanks Kai Acked-by: Zhangfei Gao Thanks
Re: [PATCH] uacce: fix some coding styles
On 2020/7/20 下午3:18, Kai Ye wrote: 1. add some parameter check. 2. delete some redundant code. 3. modify the module author information. Signed-off-by: Kai Ye Reviewed-by: Zhou Wang Thanks Kai. --- drivers/misc/uacce/uacce.c | 28 +--- 1 file changed, 13 insertions(+), 15 deletions(-) diff --git a/drivers/misc/uacce/uacce.c b/drivers/misc/uacce/uacce.c index 107028e..2e1af58 100644 --- a/drivers/misc/uacce/uacce.c +++ b/drivers/misc/uacce/uacce.c @@ -63,8 +63,12 @@ static long uacce_fops_unl_ioctl(struct file *filep, unsigned int cmd, unsigned long arg) { struct uacce_queue *q = filep->private_data; - struct uacce_device *uacce = q->uacce; + struct uacce_device *uacce; + + if (WARN_ON(!q)) + return -EINVAL; WARN_ON should not be used in uacce, instead error can be printed in user space driver. Error should not be printed in kernel log as pasid can be used by unpriv user. And I think we do not need check filep->private_data. The fd is double checked in __fget_files. Thanks
Re: [PATCH 0/2] Introduce PCI_FIXUP_IOMMU
Hi, Joerg On 2020/6/22 下午7:55, Joerg Roedel wrote: On Thu, Jun 04, 2020 at 09:33:07PM +0800, Zhangfei Gao wrote: +++ b/drivers/iommu/iommu.c @@ -2418,6 +2418,10 @@ int iommu_fwspec_init(struct device *dev, struct fwnode_handle *iommu_fwnode, fwspec->iommu_fwnode = iommu_fwnode; fwspec->ops = ops; dev_iommu_fwspec_set(dev, fwspec); + + if (dev_is_pci(dev)) + pci_fixup_device(pci_fixup_final, to_pci_dev(dev)); + That's not going to fly, I don't think we should run the fixups twice, and they should not be run from IOMMU code. Is the only reason for this second pass that iommu_fwspec is not yet allocated when it runs the first time? I ask because it might be easier to just allocate the struct earlier then. Thanks for looking this. Yes, it is the only reason calling fixup secondly after iommu_fwspec is allocated. The first time fixup final is very early in pci_bus_add_device. If allocating iommu_fwspec earlier, it maybe in pci_alloc_dev. And assigning ops still in iommu_fwspec_init. Have tested it works. Not sure is it acceptable? Alternatively, adding can_stall to struct pci_dev is simple but ugly too, since pci does not know stall now. Thanks
Re: [PATCH 0/2] Introduce PCI_FIXUP_IOMMU
Hi, Bjorn On 2020/6/16 上午7:52, Bjorn Helgaas wrote: On Sat, Jun 13, 2020 at 10:30:56PM +0800, Zhangfei Gao wrote: On 2020/6/11 下午9:44, Bjorn Helgaas wrote: +++ b/drivers/iommu/iommu.c @@ -2418,6 +2418,10 @@ int iommu_fwspec_init(struct device *dev, struct fwnode_handle *iommu_fwnode, fwspec->iommu_fwnode = iommu_fwnode; fwspec->ops = ops; dev_iommu_fwspec_set(dev, fwspec); + + if (dev_is_pci(dev)) + pci_fixup_device(pci_fixup_final, to_pci_dev(dev)); + Then pci_fixup_final will be called twice, the first in pci_bus_add_device. Here in iommu_fwspec_init is the second time, specifically for iommu_fwspec. Will send this when 5.8-rc1 is open. Wait, this whole fixup approach seems wrong to me. No matter how you do the fixup, it's still a fixup, which means it requires ongoing maintenance. Surely we don't want to have to add the Vendor/Device ID for every new AMBA device that comes along, do we? Here the fake pci device has standard PCI cfg space, but physical implementation is base on AMBA They can provide pasid feature. However, 1, does not support tlp since they are not real pci devices. 2. does not support pri, instead support stall (provided by smmu) And stall is not a pci feature, so it is not described in struct pci_dev, but in struct iommu_fwspec. So we use this fixup to tell pci system that the devices can support stall, and hereby support pasid. This did not answer my question. Are you proposing that we update a quirk every time a new AMBA device is released? I don't think that would be a good model. Yes, you are right, but we do not have any better idea yet. Currently we have three fake pci devices, which support stall and pasid. We have to let pci system know the device can support pasid, because of stall feature, though not support pri. Do you have any other ideas? It sounds like the best way would be to allocate a PCI capability for it, so detection can be done through config space, at least in future devices, or possibly after a firmware update if the config space in your system is controlled by firmware somewhere. Once there is a proper mechanism to do this, using fixups to detect the early devices that don't use that should be uncontroversial. I have no idea what the process or timeline is to add new capabilities into the PCIe specification, or if this one would be acceptable to the PCI SIG at all. That sounds like a possibility. The spec already defines a Vendor-Specific Extended Capability (PCIe r5.0, sec 7.9.5) that might be a candidate. Will investigate this, thanks Bjorn FWIW, there's also a Vendor-Specific Capability that can appear in the first 256 bytes of config space (the Vendor-Specific Extended Capability must appear in the "Extended Configuration Space" from 0x100-0xfff). Unfortunately our silicon does not have either Vendor-Specific Capability or Vendor-Specific Extended Capability. Studied commit 8531e283bee66050734fb0e89d53e85fd5ce24a4 Looks this method requires adding member (like can_stall) to struct pci_dev, looks difficult. The problem is that we don't want to add device IDs every time a new chip comes out. Adding one or two device IDs for silicon that's already released is not a problem as long as you have a strategy for *future* devices so they don't require a quirk. If detection cannot be done through PCI config space, the next best alternative is to pass auxiliary data through firmware. On DT based machines, you can list non-hotpluggable PCIe devices and add custom properties that could be read during device enumeration. I assume ACPI has something similar, but I have not done that. Yes, thanks Arnd ACPI has _DSM (ACPI v6.3, sec 9.1.1), which might be a candidate. I like this better than a PCI capability because the property you need to expose is not a PCI property. _DSM may not workable, since it is working in runtime. We need stall information in init stage, neither too early (after allocation of iommu_fwspec) nor too late (before arm_smmu_add_device ). I'm not aware of a restriction on when _DSM can be evaluated. I'm looking at ACPI v6.3, sec 9.1.1. Are you seeing something different? DSM method seems requires vendor specific guid, and code would be vendor specific. _DSM indeed requires a vendor-specific UUID, precisely *because* vendors are free to define their own functionality without requiring changes to the ACPI spec. From the spec (ACPI v6.3, sec 9.1.1): New UUIDs may also be created by OEMs and IHVs for custom devices and other interface or device governing bodies (e.g. the PCI SIG), as long as the UUID is different from other published UUIDs. Have studied _DSM method, two issues we met comparing using quirk. 1. Need change definition of either pci_host_bridge or pci_dev, like adding member can_stall, while pci system does not know stall now. a, pci devices do not have uuid: uuid nee
[PATCH] uacce: remove uacce_vma_fault
Fix NULL pointer error if removing uacce's parent module during app's running. SIGBUS is already reported by do_page_fault, so uacce_vma_fault is not needed. If providing vma_fault, vmf->page has to be filled as well, required by __do_fault. Reported-by: Jean-Philippe Brucker Signed-off-by: Zhangfei Gao --- drivers/misc/uacce/uacce.c | 9 - 1 file changed, 9 deletions(-) diff --git a/drivers/misc/uacce/uacce.c b/drivers/misc/uacce/uacce.c index 107028e..aa91f69 100644 --- a/drivers/misc/uacce/uacce.c +++ b/drivers/misc/uacce/uacce.c @@ -179,14 +179,6 @@ static int uacce_fops_release(struct inode *inode, struct file *filep) return 0; } -static vm_fault_t uacce_vma_fault(struct vm_fault *vmf) -{ - if (vmf->flags & (FAULT_FLAG_MKWRITE | FAULT_FLAG_WRITE)) - return VM_FAULT_SIGBUS; - - return 0; -} - static void uacce_vma_close(struct vm_area_struct *vma) { struct uacce_queue *q = vma->vm_private_data; @@ -199,7 +191,6 @@ static void uacce_vma_close(struct vm_area_struct *vma) } static const struct vm_operations_struct uacce_vm_ops = { - .fault = uacce_vma_fault, .close = uacce_vma_close, }; -- 2.7.4
[PATCH v2] crypto: hisilicon - fix strncpy warning with strscpy
Use strscpy to fix the warning warning: 'strncpy' specified bound 64 equals destination size Reported-by: kernel test robot Signed-off-by: Zhangfei Gao --- v2: Use strscpy instead of strlcpy since better truncation handling suggested by Herbert Rebase to 5.8-rc1 drivers/crypto/hisilicon/qm.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/drivers/crypto/hisilicon/qm.c b/drivers/crypto/hisilicon/qm.c index 9bb263cec6c3..8ac293afa8ab 100644 --- a/drivers/crypto/hisilicon/qm.c +++ b/drivers/crypto/hisilicon/qm.c @@ -2179,8 +2179,12 @@ static int qm_alloc_uacce(struct hisi_qm *qm) .flags = UACCE_DEV_SVA, .ops = &uacce_qm_ops, }; + int ret; - strncpy(interface.name, pdev->driver->name, sizeof(interface.name)); + ret = strscpy(interface.name, pdev->driver->name, + sizeof(interface.name)); + if (ret < 0) + return -ENAMETOOLONG; uacce = uacce_alloc(&pdev->dev, &interface); if (IS_ERR(uacce)) -- 2.23.0
Re: [PATCH 0/2] Introduce PCI_FIXUP_IOMMU
On 2020/6/11 下午9:44, Bjorn Helgaas wrote: +++ b/drivers/iommu/iommu.c @@ -2418,6 +2418,10 @@ int iommu_fwspec_init(struct device *dev, struct fwnode_handle *iommu_fwnode, fwspec->iommu_fwnode = iommu_fwnode; fwspec->ops = ops; dev_iommu_fwspec_set(dev, fwspec); + + if (dev_is_pci(dev)) + pci_fixup_device(pci_fixup_final, to_pci_dev(dev)); + Then pci_fixup_final will be called twice, the first in pci_bus_add_device. Here in iommu_fwspec_init is the second time, specifically for iommu_fwspec. Will send this when 5.8-rc1 is open. Wait, this whole fixup approach seems wrong to me. No matter how you do the fixup, it's still a fixup, which means it requires ongoing maintenance. Surely we don't want to have to add the Vendor/Device ID for every new AMBA device that comes along, do we? Here the fake pci device has standard PCI cfg space, but physical implementation is base on AMBA They can provide pasid feature. However, 1, does not support tlp since they are not real pci devices. 2. does not support pri, instead support stall (provided by smmu) And stall is not a pci feature, so it is not described in struct pci_dev, but in struct iommu_fwspec. So we use this fixup to tell pci system that the devices can support stall, and hereby support pasid. This did not answer my question. Are you proposing that we update a quirk every time a new AMBA device is released? I don't think that would be a good model. Yes, you are right, but we do not have any better idea yet. Currently we have three fake pci devices, which support stall and pasid. We have to let pci system know the device can support pasid, because of stall feature, though not support pri. Do you have any other ideas? It sounds like the best way would be to allocate a PCI capability for it, so detection can be done through config space, at least in future devices, or possibly after a firmware update if the config space in your system is controlled by firmware somewhere. Once there is a proper mechanism to do this, using fixups to detect the early devices that don't use that should be uncontroversial. I have no idea what the process or timeline is to add new capabilities into the PCIe specification, or if this one would be acceptable to the PCI SIG at all. That sounds like a possibility. The spec already defines a Vendor-Specific Extended Capability (PCIe r5.0, sec 7.9.5) that might be a candidate. Will investigate this, thanks Bjorn FWIW, there's also a Vendor-Specific Capability that can appear in the first 256 bytes of config space (the Vendor-Specific Extended Capability must appear in the "Extended Configuration Space" from 0x100-0xfff). Unfortunately our silicon does not have either Vendor-Specific Capability or Vendor-Specific Extended Capability. Studied commit 8531e283bee66050734fb0e89d53e85fd5ce24a4 Looks this method requires adding member (like can_stall) to struct pci_dev, looks difficult. If detection cannot be done through PCI config space, the next best alternative is to pass auxiliary data through firmware. On DT based machines, you can list non-hotpluggable PCIe devices and add custom properties that could be read during device enumeration. I assume ACPI has something similar, but I have not done that. Yes, thanks Arnd ACPI has _DSM (ACPI v6.3, sec 9.1.1), which might be a candidate. I like this better than a PCI capability because the property you need to expose is not a PCI property. _DSM may not workable, since it is working in runtime. We need stall information in init stage, neither too early (after allocation of iommu_fwspec) nor too late (before arm_smmu_add_device ). I'm not aware of a restriction on when _DSM can be evaluated. I'm looking at ACPI v6.3, sec 9.1.1. Are you seeing something different? DSM method seems requires vendor specific guid, and code would be vendor specific. Except adding uuid to some spec like pci_acpi_dsm_guid. obj = acpi_evaluate_dsm(ACPI_HANDLE(bus->bridge), &pci_acpi_dsm_guid, 1, IGNORE_PCI_BOOT_CONFIG_DSM, NULL); By the way, It would be a long time if we need modify either pcie spec or acpi spec. Can we use pci_fixup_device in iommu_fwspec_init first, it is relatively simple and meet the requirement of platform device using pasid, and they are already in product. Neither the PCI Vendor-Specific Capability nor the ACPI _DSM requires a spec change. Both can be completely vendor-defined. Adding vendor-specific code to common files looks a bit ugly. Thanks
Re: [RFC PATCH] PCI: Remove End-End TLP as PASID dependency
On 2020/6/12 上午1:41, Bjorn Helgaas wrote: [+cc Sinan] On Wed, Jun 10, 2020 at 12:18:14PM +0800, Zhangfei Gao wrote: Some platform devices appear as PCI and have PCI cfg space, but are actually on the AMBA bus. They can support PASID via smmu stall feature, but does not support tlp since they are not real pci devices. So remove tlp as a PASID dependency. When you iterate on this, pay attention to things like: - Wrap paragraphs to 75 columns or so, so they fill the whole line but don't overflow when "git show" adds 4 spaces. - Leave a blank line between paragraphs. - Capitalize consistently: "SMMU", "PCI", "TLP". - Provide references to relevant spec sections, e.g., for the SMMU stall feature. OK, Thanks Bjorn Signed-off-by: Zhangfei Gao --- drivers/pci/ats.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c index 390e92f..8e31278 100644 --- a/drivers/pci/ats.c +++ b/drivers/pci/ats.c @@ -344,9 +344,6 @@ int pci_enable_pasid(struct pci_dev *pdev, int features) if (WARN_ON(pdev->pasid_enabled)) return -EBUSY; - if (!pdev->eetlp_prefix_path) - return -EINVAL; No. This would mean we might enable PASID on actual PCIe devices when it is not safe to do so, as Jean-Philippe pointed out. You cannot break actual PCIe devices just to make your non-PCIe device work. These devices do not support PASID as defined in the PCIe spec. They might support something *like* PASID, and you might be able to make parts of the PCI core work with them, but you're going to have to deal with the parts that don't follow the PCIe spec on your own. That might be quirks, it might be some sort of AMBA adaptation shim, I don't know. But it's not the responsibility of the PCI core to adapt to them. Understand now. Will continue use quirk for this. Thanks
Re: [PATCH 0/2] Introduce PCI_FIXUP_IOMMU
On 2020/6/10 上午12:49, Bjorn Helgaas wrote: On Tue, Jun 09, 2020 at 11:15:06AM +0200, Arnd Bergmann wrote: On Tue, Jun 9, 2020 at 6:02 AM Zhangfei Gao wrote: On 2020/6/9 上午12:41, Bjorn Helgaas wrote: On Mon, Jun 08, 2020 at 10:54:15AM +0800, Zhangfei Gao wrote: On 2020/6/6 上午7:19, Bjorn Helgaas wrote: +++ b/drivers/iommu/iommu.c @@ -2418,6 +2418,10 @@ int iommu_fwspec_init(struct device *dev, struct fwnode_handle *iommu_fwnode, fwspec->iommu_fwnode = iommu_fwnode; fwspec->ops = ops; dev_iommu_fwspec_set(dev, fwspec); + + if (dev_is_pci(dev)) + pci_fixup_device(pci_fixup_final, to_pci_dev(dev)); + Then pci_fixup_final will be called twice, the first in pci_bus_add_device. Here in iommu_fwspec_init is the second time, specifically for iommu_fwspec. Will send this when 5.8-rc1 is open. Wait, this whole fixup approach seems wrong to me. No matter how you do the fixup, it's still a fixup, which means it requires ongoing maintenance. Surely we don't want to have to add the Vendor/Device ID for every new AMBA device that comes along, do we? Here the fake pci device has standard PCI cfg space, but physical implementation is base on AMBA They can provide pasid feature. However, 1, does not support tlp since they are not real pci devices. 2. does not support pri, instead support stall (provided by smmu) And stall is not a pci feature, so it is not described in struct pci_dev, but in struct iommu_fwspec. So we use this fixup to tell pci system that the devices can support stall, and hereby support pasid. This did not answer my question. Are you proposing that we update a quirk every time a new AMBA device is released? I don't think that would be a good model. Yes, you are right, but we do not have any better idea yet. Currently we have three fake pci devices, which support stall and pasid. We have to let pci system know the device can support pasid, because of stall feature, though not support pri. Do you have any other ideas? It sounds like the best way would be to allocate a PCI capability for it, so detection can be done through config space, at least in future devices, or possibly after a firmware update if the config space in your system is controlled by firmware somewhere. Once there is a proper mechanism to do this, using fixups to detect the early devices that don't use that should be uncontroversial. I have no idea what the process or timeline is to add new capabilities into the PCIe specification, or if this one would be acceptable to the PCI SIG at all. That sounds like a possibility. The spec already defines a Vendor-Specific Extended Capability (PCIe r5.0, sec 7.9.5) that might be a candidate. Will investigate this, thanks Bjorn If detection cannot be done through PCI config space, the next best alternative is to pass auxiliary data through firmware. On DT based machines, you can list non-hotpluggable PCIe devices and add custom properties that could be read during device enumeration. I assume ACPI has something similar, but I have not done that. Yes, thanks Arnd ACPI has _DSM (ACPI v6.3, sec 9.1.1), which might be a candidate. I like this better than a PCI capability because the property you need to expose is not a PCI property. _DSM may not workable, since it is working in runtime. We need stall information in init stage, neither too early (after allocation of iommu_fwspec) nor too late (before arm_smmu_add_device ). By the way, It would be a long time if we need modify either pcie spec or acpi spec. Can we use pci_fixup_device in iommu_fwspec_init first, it is relatively simple and meet the requirement of platform device using pasid, and they are already in product. Thanks
Re: [RFC PATCH] PCI: Remove End-End TLP as PASID dependency
On 2020/6/10 下午3:46, Jean-Philippe Brucker wrote: On Wed, Jun 10, 2020 at 12:18:14PM +0800, Zhangfei Gao wrote: Some platform devices appear as PCI and have PCI cfg space, but are actually on the AMBA bus. They can support PASID via smmu stall feature, but does not support tlp since they are not real pci devices. So remove tlp as a PASID dependency. Signed-off-by: Zhangfei Gao --- drivers/pci/ats.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c index 390e92f..8e31278 100644 --- a/drivers/pci/ats.c +++ b/drivers/pci/ats.c @@ -344,9 +344,6 @@ int pci_enable_pasid(struct pci_dev *pdev, int features) if (WARN_ON(pdev->pasid_enabled)) return -EBUSY; - if (!pdev->eetlp_prefix_path) - return -EINVAL; - This check is useful, and follows the PCI specification (4.0r1.0 2.2.10.2 End-End TLP Prefix Processing: "Software should ensure that TLPs containing End-End TLP Prefixes are not sent to components that do not support them.") Thanks Jean, Why not set the eetlp_prefix_path bit from a PCI quirk? Unlike the stall problem from the other thread, this one looks like a simple design mistake that can be fixed easily in future iterations of the platform: just set the "End-End TLP Prefix Supported" bit in the Device Capability 2 Register of all bridges. Yes, we can still set eetlp_prefix_path bit from a PCI quirk. And we also have considered adding this bit in Device Capability 2 Register in future silicon. But we hesitated that it does reflect the real function: from register, it can support tlp, but in fact, it does not. Thanks
[RFC PATCH] PCI: Remove End-End TLP as PASID dependency
Some platform devices appear as PCI and have PCI cfg space, but are actually on the AMBA bus. They can support PASID via smmu stall feature, but does not support tlp since they are not real pci devices. So remove tlp as a PASID dependency. Signed-off-by: Zhangfei Gao --- drivers/pci/ats.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c index 390e92f..8e31278 100644 --- a/drivers/pci/ats.c +++ b/drivers/pci/ats.c @@ -344,9 +344,6 @@ int pci_enable_pasid(struct pci_dev *pdev, int features) if (WARN_ON(pdev->pasid_enabled)) return -EBUSY; - if (!pdev->eetlp_prefix_path) - return -EINVAL; - if (!pasid) return -EINVAL; -- 2.7.4
Re: [PATCH 0/2] Introduce PCI_FIXUP_IOMMU
Hi, Bjorn On 2020/6/9 上午12:41, Bjorn Helgaas wrote: On Mon, Jun 08, 2020 at 10:54:15AM +0800, Zhangfei Gao wrote: On 2020/6/6 上午7:19, Bjorn Helgaas wrote: On Thu, Jun 04, 2020 at 09:33:07PM +0800, Zhangfei Gao wrote: On 2020/6/2 上午1:41, Bjorn Helgaas wrote: On Thu, May 28, 2020 at 09:33:44AM +0200, Joerg Roedel wrote: On Wed, May 27, 2020 at 01:18:42PM -0500, Bjorn Helgaas wrote: Is this slowdown significant? We already iterate over every device when applying PCI_FIXUP_FINAL quirks, so if we used the existing PCI_FIXUP_FINAL, we wouldn't be adding a new loop. We would only be adding two more iterations to the loop in pci_do_fixups() that tries to match quirks against the current device. I doubt that would be a measurable slowdown. I don't know how significant it is, but I remember people complaining about adding new PCI quirks because it takes too long for them to run them all. That was in the discussion about the quirk disabling ATS on AMD Stoney systems. So it probably depends on how many PCI devices are in the system whether it causes any measureable slowdown. I found this [1] from Paul Menzel, which was a slowdown caused by quirk_usb_early_handoff(). I think the real problem is individual quirks that take a long time. The PCI_FIXUP_IOMMU things we're talking about should be fast, and of course, they're only run for matching devices anyway. So I'd rather keep them as PCI_FIXUP_FINAL than add a whole new phase. Thanks Bjorn for taking time for this. If so, it would be much simpler. +++ b/drivers/iommu/iommu.c @@ -2418,6 +2418,10 @@ int iommu_fwspec_init(struct device *dev, struct fwnode_handle *iommu_fwnode, fwspec->iommu_fwnode = iommu_fwnode; fwspec->ops = ops; dev_iommu_fwspec_set(dev, fwspec); + + if (dev_is_pci(dev)) + pci_fixup_device(pci_fixup_final, to_pci_dev(dev)); + Then pci_fixup_final will be called twice, the first in pci_bus_add_device. Here in iommu_fwspec_init is the second time, specifically for iommu_fwspec. Will send this when 5.8-rc1 is open. Wait, this whole fixup approach seems wrong to me. No matter how you do the fixup, it's still a fixup, which means it requires ongoing maintenance. Surely we don't want to have to add the Vendor/Device ID for every new AMBA device that comes along, do we? Here the fake pci device has standard PCI cfg space, but physical implementation is base on AMBA They can provide pasid feature. However, 1, does not support tlp since they are not real pci devices. 2. does not support pri, instead support stall (provided by smmu) And stall is not a pci feature, so it is not described in struct pci_dev, but in struct iommu_fwspec. So we use this fixup to tell pci system that the devices can support stall, and hereby support pasid. This did not answer my question. Are you proposing that we update a quirk every time a new AMBA device is released? I don't think that would be a good model. Yes, you are right, but we do not have any better idea yet. Currently we have three fake pci devices, which support stall and pasid. We have to let pci system know the device can support pasid, because of stall feature, though not support pri. Do you have any other ideas? Thanks
Re: [PATCH 0/2] Introduce PCI_FIXUP_IOMMU
Hi, Bjorn On 2020/6/6 上午7:19, Bjorn Helgaas wrote: On Thu, Jun 04, 2020 at 09:33:07PM +0800, Zhangfei Gao wrote: On 2020/6/2 上午1:41, Bjorn Helgaas wrote: On Thu, May 28, 2020 at 09:33:44AM +0200, Joerg Roedel wrote: On Wed, May 27, 2020 at 01:18:42PM -0500, Bjorn Helgaas wrote: Is this slowdown significant? We already iterate over every device when applying PCI_FIXUP_FINAL quirks, so if we used the existing PCI_FIXUP_FINAL, we wouldn't be adding a new loop. We would only be adding two more iterations to the loop in pci_do_fixups() that tries to match quirks against the current device. I doubt that would be a measurable slowdown. I don't know how significant it is, but I remember people complaining about adding new PCI quirks because it takes too long for them to run them all. That was in the discussion about the quirk disabling ATS on AMD Stoney systems. So it probably depends on how many PCI devices are in the system whether it causes any measureable slowdown. I found this [1] from Paul Menzel, which was a slowdown caused by quirk_usb_early_handoff(). I think the real problem is individual quirks that take a long time. The PCI_FIXUP_IOMMU things we're talking about should be fast, and of course, they're only run for matching devices anyway. So I'd rather keep them as PCI_FIXUP_FINAL than add a whole new phase. Thanks Bjorn for taking time for this. If so, it would be much simpler. +++ b/drivers/iommu/iommu.c @@ -2418,6 +2418,10 @@ int iommu_fwspec_init(struct device *dev, struct fwnode_handle *iommu_fwnode, fwspec->iommu_fwnode = iommu_fwnode; fwspec->ops = ops; dev_iommu_fwspec_set(dev, fwspec); + + if (dev_is_pci(dev)) + pci_fixup_device(pci_fixup_final, to_pci_dev(dev)); + Then pci_fixup_final will be called twice, the first in pci_bus_add_device. Here in iommu_fwspec_init is the second time, specifically for iommu_fwspec. Will send this when 5.8-rc1 is open. Wait, this whole fixup approach seems wrong to me. No matter how you do the fixup, it's still a fixup, which means it requires ongoing maintenance. Surely we don't want to have to add the Vendor/Device ID for every new AMBA device that comes along, do we? Here the fake pci device has standard PCI cfg space, but physical implementation is base on AMBA They can provide pasid feature. However, 1, does not support tlp since they are not real pci devices. 2. does not support pri, instead support stall (provided by smmu) And stall is not a pci feature, so it is not described in struct pci_dev, but in struct iommu_fwspec. So we use this fixup to tell pci system that the devices can support stall, and hereby support pasid. Thanks
Re: [PATCH] crypto: hisilicon - fix strncpy warning with strlcpy
On 2020/6/5 下午11:49, Eric Biggers wrote: On Fri, Jun 05, 2020 at 11:26:20PM +0800, Zhangfei Gao wrote: On 2020/6/5 下午8:17, Herbert Xu wrote: On Fri, Jun 05, 2020 at 05:34:32PM +0800, Zhangfei Gao wrote: Will add a check after the copy. strlcpy(interface.name, pdev->driver->name, sizeof(interface.name)); if (strlen(pdev->driver->name) != strlen(interface.name)) return -EINVAL; You don't need to do strlen. The function strlcpy returns the length of the source string. Better yet use strscpy which will even return an error for you. Yes, good idea, we can use strscpy. + int ret; - strncpy(interface.name, pdev->driver->name, sizeof(interface.name)); + ret = strscpy(interface.name, pdev->driver->name, sizeof(interface.name)); + if (ret < 0) + return ret; You might want to use -ENAMETOOLONG instead of the strscpy return value of -E2BIG. Yes, make sense, thanks Eric
Re: [PATCH] crypto: hisilicon - fix strncpy warning with strlcpy
On 2020/6/5 下午8:17, Herbert Xu wrote: On Fri, Jun 05, 2020 at 05:34:32PM +0800, Zhangfei Gao wrote: Will add a check after the copy. strlcpy(interface.name, pdev->driver->name, sizeof(interface.name)); if (strlen(pdev->driver->name) != strlen(interface.name)) return -EINVAL; You don't need to do strlen. The function strlcpy returns the length of the source string. Better yet use strscpy which will even return an error for you. Yes, good idea, we can use strscpy. + int ret; - strncpy(interface.name, pdev->driver->name, sizeof(interface.name)); + ret = strscpy(interface.name, pdev->driver->name, sizeof(interface.name)); + if (ret < 0) + return ret; Will resend later, thanks Herbert.
Re: [PATCH] crypto: hisilicon - fix strncpy warning with strlcpy
On 2020/6/4 下午2:50, Herbert Xu wrote: On Thu, Jun 04, 2020 at 02:44:16PM +0800, Zhangfei Gao wrote: I think it is fine. 1. Currently the name size is 64, bigger enough. Simply grep in driver name, 64 should be enough. We can make it larger when there is a request. 2. it does not matter what the name is, since it is just an interface. cat /sys/class/uacce/hisi_zip-0/flags cat /sys/class/uacce/his-0/flags should be both fine to app only they can be distinguished. 3. It maybe a hard restriction to fail just because of a long name. I think we should err on the side of caution. IOW, unless you know that you need it to succeed when it exceeds the limit, then you should just make it fail. Thanks Herbert Will add a check after the copy. strlcpy(interface.name, pdev->driver->name, sizeof(interface.name)); if (strlen(pdev->driver->name) != strlen(interface.name)) return -EINVAL; Will resend the fix after rc1 is open. Thanks
Re: [PATCH 0/2] Introduce PCI_FIXUP_IOMMU
On 2020/6/2 上午1:41, Bjorn Helgaas wrote: On Thu, May 28, 2020 at 09:33:44AM +0200, Joerg Roedel wrote: On Wed, May 27, 2020 at 01:18:42PM -0500, Bjorn Helgaas wrote: Is this slowdown significant? We already iterate over every device when applying PCI_FIXUP_FINAL quirks, so if we used the existing PCI_FIXUP_FINAL, we wouldn't be adding a new loop. We would only be adding two more iterations to the loop in pci_do_fixups() that tries to match quirks against the current device. I doubt that would be a measurable slowdown. I don't know how significant it is, but I remember people complaining about adding new PCI quirks because it takes too long for them to run them all. That was in the discussion about the quirk disabling ATS on AMD Stoney systems. So it probably depends on how many PCI devices are in the system whether it causes any measureable slowdown. I found this [1] from Paul Menzel, which was a slowdown caused by quirk_usb_early_handoff(). I think the real problem is individual quirks that take a long time. The PCI_FIXUP_IOMMU things we're talking about should be fast, and of course, they're only run for matching devices anyway. So I'd rather keep them as PCI_FIXUP_FINAL than add a whole new phase. Thanks Bjorn for taking time for this. If so, it would be much simpler. +++ b/drivers/iommu/iommu.c @@ -2418,6 +2418,10 @@ int iommu_fwspec_init(struct device *dev, struct fwnode_handle *iommu_fwnode, fwspec->iommu_fwnode = iommu_fwnode; fwspec->ops = ops; dev_iommu_fwspec_set(dev, fwspec); + + if (dev_is_pci(dev)) + pci_fixup_device(pci_fixup_final, to_pci_dev(dev)); + Then pci_fixup_final will be called twice, the first in pci_bus_add_device. Here in iommu_fwspec_init is the second time, specifically for iommu_fwspec. Will send this when 5.8-rc1 is open. Thanks
Re: [PATCH] crypto: hisilicon - fix strncpy warning with strlcpy
On 2020/6/4 下午2:18, Herbert Xu wrote: On Thu, Jun 04, 2020 at 02:10:37PM +0800, Zhangfei Gao wrote: Should this even allow truncation? Perhaps it'd be better to fail in case of an overrun? I think we do not need consider overrun, since it at most copy size-1 bytes to dest. From the manual: strlcpy() This function is similar to strncpy(), but it copies at most size-1 bytes to dest, always adds a terminating null byte, And simple tested with smaller SIZE of interface.name, only SIZE-1 is copied, so it is safe. -#define UACCE_MAX_NAME_SIZE 64 +#define UACCE_MAX_NAME_SIZE 4 That's not what I meant. As it is if you do exceed the limit the name is silently truncated. Wouldn't it be better to fail the allocation instead? I think it is fine. 1. Currently the name size is 64, bigger enough. Simply grep in driver name, 64 should be enough. We can make it larger when there is a request. 2. it does not matter what the name is, since it is just an interface. cat /sys/class/uacce/hisi_zip-0/flags cat /sys/class/uacce/his-0/flags should be both fine to app only they can be distinguished. 3. It maybe a hard restriction to fail just because of a long name. What do you think. Thanks
Re: [PATCH] crypto: hisilicon - fix strncpy warning with strlcpy
On 2020/6/4 上午11:39, Herbert Xu wrote: On Thu, Jun 04, 2020 at 11:32:04AM +0800, Zhangfei Gao wrote: Use strlcpy to fix the warning warning: 'strncpy' specified bound 64 equals destination size [-Wstringop-truncation] Reported-by: kernel test robot Signed-off-by: Zhangfei Gao --- drivers/crypto/hisilicon/qm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/crypto/hisilicon/qm.c b/drivers/crypto/hisilicon/qm.c index f795fb5..224f3e2 100644 --- a/drivers/crypto/hisilicon/qm.c +++ b/drivers/crypto/hisilicon/qm.c @@ -1574,7 +1574,7 @@ static int qm_alloc_uacce(struct hisi_qm *qm) .ops = &uacce_qm_ops, }; - strncpy(interface.name, pdev->driver->name, sizeof(interface.name)); + strlcpy(interface.name, pdev->driver->name, sizeof(interface.name)); Should this even allow truncation? Perhaps it'd be better to fail in case of an overrun? I think we do not need consider overrun, since it at most copy size-1 bytes to dest. From the manual: strlcpy() This function is similar to strncpy(), but it copies at most size-1 bytes to dest, always adds a terminating null byte, And simple tested with smaller SIZE of interface.name, only SIZE-1 is copied, so it is safe. -#define UACCE_MAX_NAME_SIZE 64 +#define UACCE_MAX_NAME_SIZE 4 Thanks
[PATCH] crypto: hisilicon - fix strncpy warning with strlcpy
Use strlcpy to fix the warning warning: 'strncpy' specified bound 64 equals destination size [-Wstringop-truncation] Reported-by: kernel test robot Signed-off-by: Zhangfei Gao --- drivers/crypto/hisilicon/qm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/crypto/hisilicon/qm.c b/drivers/crypto/hisilicon/qm.c index f795fb5..224f3e2 100644 --- a/drivers/crypto/hisilicon/qm.c +++ b/drivers/crypto/hisilicon/qm.c @@ -1574,7 +1574,7 @@ static int qm_alloc_uacce(struct hisi_qm *qm) .ops = &uacce_qm_ops, }; - strncpy(interface.name, pdev->driver->name, sizeof(interface.name)); + strlcpy(interface.name, pdev->driver->name, sizeof(interface.name)); uacce = uacce_alloc(&pdev->dev, &interface); if (IS_ERR(uacce)) -- 2.7.4
Re: [PATCH 2/2] iommu: calling pci_fixup_iommu in iommu_fwspec_init
On 2020/5/27 下午5:01, Greg Kroah-Hartman wrote: On Tue, May 26, 2020 at 07:49:09PM +0800, Zhangfei Gao wrote: Calling pci_fixup_iommu in iommu_fwspec_init, which alloc iommu_fwnode. Some platform devices appear as PCI but are actually on the AMBA bus, and they need fixup in drivers/pci/quirks.c handling iommu_fwnode. So calling pci_fixup_iommu after iommu_fwnode is allocated. Signed-off-by: Zhangfei Gao --- drivers/iommu/iommu.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 7b37542..fb84c42 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -2418,6 +2418,10 @@ int iommu_fwspec_init(struct device *dev, struct fwnode_handle *iommu_fwnode, fwspec->iommu_fwnode = iommu_fwnode; fwspec->ops = ops; dev_iommu_fwspec_set(dev, fwspec); + + if (dev_is_pci(dev)) + pci_fixup_device(pci_fixup_iommu, to_pci_dev(dev)); Why can't the caller do this as it "knows" it is a PCI device at that point in time, right? Putting fixup here is because 1. iommu_fwspec has been allocated 2. iommu_fwspec_init will be called by of_pci_iommu_init and iort_pci_iommu_init, covering both acpi and dt Thanks
Re: [PATCH 0/2] Introduce PCI_FIXUP_IOMMU
Hi, Bjorn On 2020/5/28 上午2:18, Bjorn Helgaas wrote: On Tue, May 26, 2020 at 07:49:07PM +0800, Zhangfei Gao wrote: Some platform devices appear as PCI but are actually on the AMBA bus, and they need fixup in drivers/pci/quirks.c handling iommu_fwnode. Here introducing PCI_FIXUP_IOMMU, which is called after iommu_fwnode is allocated, instead of reusing PCI_FIXUP_FINAL since it will slow down iommu probing as all devices in fixup final list will be reprocessed, suggested by Joerg, [1] Is this slowdown significant? We already iterate over every device when applying PCI_FIXUP_FINAL quirks, so if we used the existing PCI_FIXUP_FINAL, we wouldn't be adding a new loop. We would only be adding two more iterations to the loop in pci_do_fixups() that tries to match quirks against the current device. I doubt that would be a measurable slowdown. I do not notice the difference when compared fixup_iommu and fixup_final via get_jiffies_64, since in our platform no other pci fixup is registered. Here the plan is adding pci_fixup_device in iommu_fwspec_init, so if using fixup_final the iteration will be done again here. For example: Hisilicon platform device need fixup in drivers/pci/quirks.c handling fwspec->can_stall, which is introduced in [2] +static void quirk_huawei_pcie_sva(struct pci_dev *pdev) +{ +struct iommu_fwspec *fwspec; + +pdev->eetlp_prefix_path = 1; +fwspec = dev_iommu_fwspec_get(&pdev->dev); +if (fwspec) +fwspec->can_stall = 1; +} + +DECLARE_PCI_FIXUP_IOMMU(PCI_VENDOR_ID_HUAWEI, 0xa250, quirk_huawei_pcie_sva); +DECLARE_PCI_iFIXUP_IOMMU(PCI_VENDOR_ID_HUAWEI, 0xa251, quirk_huawei_pcie_sva); [1] https://www.spinics.net/lists/iommu/msg44591.html [2] https://www.spinics.net/lists/linux-pci/msg94559.html If you reference these in the commit logs, please use lore.kernel.org links instead of spinics. Got it, thanks Bjorn.
Re: [PATCH 0/2] Introduce PCI_FIXUP_IOMMU
On 2020/5/27 下午5:53, Arnd Bergmann wrote: On Wed, May 27, 2020 at 11:00 AM Greg Kroah-Hartman wrote: On Tue, May 26, 2020 at 07:49:07PM +0800, Zhangfei Gao wrote: Some platform devices appear as PCI but are actually on the AMBA bus, Why would these devices not just show up on the AMBA bus and use all of that logic instead of being a PCI device and having to go through odd fixes like this? There is a general move to having hardware be discoverable even with ARM processors. Having on-chip devices be discoverable using PCI config space is how x86 SoCs usually do it, and that is generally a good thing as it means we don't need to describe them in DT I guess as the hardware designers are still learning about it, this is not always done correctly. In general, we can also describe PCI devices on DT and do fixups during the probing there, but I suspect that won't work as easily using ACPI probing, so the fixup is keyed off the hardware ID, again as is common for x86 on-chip devices. Yes, thanks Arnd :) In order to use pasid, io page fault has to be supported, either by PCI PRI feature (from pci device) or stall mode from smmu (platform device). Here is letting system know the platform device can support smmu stall mode, as a result support pasid. While stall is not a pci capability, so we use a fixup here. Thanks
Re: [PATCH 1/2] PCI: Introduce PCI_FIXUP_IOMMU
Hi, Christoph On 2020/5/26 下午10:46, Christoph Hellwig wrote: On Tue, May 26, 2020 at 07:49:08PM +0800, Zhangfei Gao wrote: Some platform devices appear as PCI but are actually on the AMBA bus, and they need fixup in drivers/pci/quirks.c handling iommu_fwnode. Here introducing PCI_FIXUP_IOMMU, which is called after iommu_fwnode is allocated, instead of reusing PCI_FIXUP_FINAL since it will slow down iommu probing as all devices in fixup final list will be reprocessed. Who is going to use this? I don't see a single user in the series. We will add iommu fixup in drivers/pci/quirks.c, handling fwspec->can_stall, which is introduced in https://www.spinics.net/lists/linux-pci/msg94559.html Unfortunately, the patch does not catch v5.8, so we have to wait. And we want to check whether this is a right method to solve this issue. Thanks
Re: [PATCH 0/2] Let pci_fixup_final access iommu_fwnode
On 2020/5/25 下午9:43, Joerg Roedel wrote: On Tue, May 12, 2020 at 12:08:29PM +0800, Zhangfei Gao wrote: Some platform devices appear as PCI but are actually on the AMBA bus, and they need fixup in drivers/pci/quirks.c handling iommu_fwnode. So calling pci_fixup_final after iommu_fwnode is allocated. For example: Hisilicon platform device need fixup in drivers/pci/quirks.c +static void quirk_huawei_pcie_sva(struct pci_dev *pdev) +{ + struct iommu_fwspec *fwspec; + + pdev->eetlp_prefix_path = 1; + fwspec = dev_iommu_fwspec_get(&pdev->dev); + if (fwspec) + fwspec->can_stall = 1; +} + +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa250, quirk_huawei_pcie_sva); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa251, quirk_huawei_pcie_sva); I don't think it is a great idea to hook this into PCI_FIXUP_FINAL. The fixup list needs to be processed for every device, which will slow down probing. So either we introduce something like PCI_FIXUP_IOMMU, if this is entirely PCI specific. If it needs to be generic we need some fixup infrastructure in the IOMMU code itself. Thanks Joerg for the good suggestion. I am trying to introduce PCI_FIXUP_IOMMU in https://lkml.org/lkml/2020/5/26/366 Thanks
[PATCH 2/2] iommu: calling pci_fixup_iommu in iommu_fwspec_init
Calling pci_fixup_iommu in iommu_fwspec_init, which alloc iommu_fwnode. Some platform devices appear as PCI but are actually on the AMBA bus, and they need fixup in drivers/pci/quirks.c handling iommu_fwnode. So calling pci_fixup_iommu after iommu_fwnode is allocated. Signed-off-by: Zhangfei Gao --- drivers/iommu/iommu.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 7b37542..fb84c42 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -2418,6 +2418,10 @@ int iommu_fwspec_init(struct device *dev, struct fwnode_handle *iommu_fwnode, fwspec->iommu_fwnode = iommu_fwnode; fwspec->ops = ops; dev_iommu_fwspec_set(dev, fwspec); + + if (dev_is_pci(dev)) + pci_fixup_device(pci_fixup_iommu, to_pci_dev(dev)); + return 0; } EXPORT_SYMBOL_GPL(iommu_fwspec_init); -- 2.7.4
[PATCH 1/2] PCI: Introduce PCI_FIXUP_IOMMU
Some platform devices appear as PCI but are actually on the AMBA bus, and they need fixup in drivers/pci/quirks.c handling iommu_fwnode. Here introducing PCI_FIXUP_IOMMU, which is called after iommu_fwnode is allocated, instead of reusing PCI_FIXUP_FINAL since it will slow down iommu probing as all devices in fixup final list will be reprocessed. Suggested-by: Joerg Roedel Signed-off-by: Zhangfei Gao --- drivers/pci/quirks.c | 7 +++ include/asm-generic/vmlinux.lds.h | 3 +++ include/linux/pci.h | 8 3 files changed, 18 insertions(+) diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index ca9ed57..b037034 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -83,6 +83,8 @@ extern struct pci_fixup __start_pci_fixups_header[]; extern struct pci_fixup __end_pci_fixups_header[]; extern struct pci_fixup __start_pci_fixups_final[]; extern struct pci_fixup __end_pci_fixups_final[]; +extern struct pci_fixup __start_pci_fixups_iommu[]; +extern struct pci_fixup __end_pci_fixups_iommu[]; extern struct pci_fixup __start_pci_fixups_enable[]; extern struct pci_fixup __end_pci_fixups_enable[]; extern struct pci_fixup __start_pci_fixups_resume[]; @@ -118,6 +120,11 @@ void pci_fixup_device(enum pci_fixup_pass pass, struct pci_dev *dev) end = __end_pci_fixups_final; break; + case pci_fixup_iommu: + start = __start_pci_fixups_iommu; + end = __end_pci_fixups_iommu; + break; + case pci_fixup_enable: start = __start_pci_fixups_enable; end = __end_pci_fixups_enable; diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h index 71e387a..3da32d8 100644 --- a/include/asm-generic/vmlinux.lds.h +++ b/include/asm-generic/vmlinux.lds.h @@ -411,6 +411,9 @@ __start_pci_fixups_final = .; \ KEEP(*(.pci_fixup_final)) \ __end_pci_fixups_final = .; \ + __start_pci_fixups_iommu = .; \ + KEEP(*(.pci_fixup_iommu)) \ + __end_pci_fixups_iommu = .; \ __start_pci_fixups_enable = .; \ KEEP(*(.pci_fixup_enable)) \ __end_pci_fixups_enable = .;\ diff --git a/include/linux/pci.h b/include/linux/pci.h index 83ce1cd..0d5fbf8 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -1892,6 +1892,7 @@ enum pci_fixup_pass { pci_fixup_early,/* Before probing BARs */ pci_fixup_header, /* After reading configuration header */ pci_fixup_final,/* Final phase of device fixups */ + pci_fixup_iommu,/* After iommu_fwspec_init */ pci_fixup_enable, /* pci_enable_device() time */ pci_fixup_resume, /* pci_device_resume() */ pci_fixup_suspend, /* pci_device_suspend() */ @@ -1934,6 +1935,10 @@ enum pci_fixup_pass { class_shift, hook) \ DECLARE_PCI_FIXUP_SECTION(.pci_fixup_final, \ hook, vendor, device, class, class_shift, hook) +#define DECLARE_PCI_FIXUP_CLASS_IOMMU(vendor, device, class, \ +class_shift, hook) \ + DECLARE_PCI_FIXUP_SECTION(.pci_fixup_iommu, \ + hook, vendor, device, class, class_shift, hook) #define DECLARE_PCI_FIXUP_CLASS_ENABLE(vendor, device, class, \ class_shift, hook) \ DECLARE_PCI_FIXUP_SECTION(.pci_fixup_enable,\ @@ -1964,6 +1969,9 @@ enum pci_fixup_pass { #define DECLARE_PCI_FIXUP_FINAL(vendor, device, hook) \ DECLARE_PCI_FIXUP_SECTION(.pci_fixup_final, \ hook, vendor, device, PCI_ANY_ID, 0, hook) +#define DECLARE_PCI_FIXUP_IOMMU(vendor, device, hook) \ + DECLARE_PCI_FIXUP_SECTION(.pci_fixup_iommu, \ + hook, vendor, device, PCI_ANY_ID, 0, hook) #define DECLARE_PCI_FIXUP_ENABLE(vendor, device, hook) \ DECLARE_PCI_FIXUP_SECTION(.pci_fixup_enable,\ hook, vendor, device, PCI_ANY_ID, 0, hook) -- 2.7.4
[PATCH 0/2] Introduce PCI_FIXUP_IOMMU
Some platform devices appear as PCI but are actually on the AMBA bus, and they need fixup in drivers/pci/quirks.c handling iommu_fwnode. Here introducing PCI_FIXUP_IOMMU, which is called after iommu_fwnode is allocated, instead of reusing PCI_FIXUP_FINAL since it will slow down iommu probing as all devices in fixup final list will be reprocessed, suggested by Joerg, [1] For example: Hisilicon platform device need fixup in drivers/pci/quirks.c handling fwspec->can_stall, which is introduced in [2] +static void quirk_huawei_pcie_sva(struct pci_dev *pdev) +{ +struct iommu_fwspec *fwspec; + +pdev->eetlp_prefix_path = 1; +fwspec = dev_iommu_fwspec_get(&pdev->dev); +if (fwspec) +fwspec->can_stall = 1; +} + +DECLARE_PCI_FIXUP_IOMMU(PCI_VENDOR_ID_HUAWEI, 0xa250, quirk_huawei_pcie_sva); +DECLARE_PCI_iFIXUP_IOMMU(PCI_VENDOR_ID_HUAWEI, 0xa251, quirk_huawei_pcie_sva); [1] https://www.spinics.net/lists/iommu/msg44591.html [2] https://www.spinics.net/lists/linux-pci/msg94559.html Zhangfei Gao (2): PCI: Introduce PCI_FIXUP_IOMMU iommu: calling pci_fixup_iommu in iommu_fwspec_init drivers/iommu/iommu.c | 4 drivers/pci/quirks.c | 7 +++ include/asm-generic/vmlinux.lds.h | 3 +++ include/linux/pci.h | 8 4 files changed, 22 insertions(+) -- 2.7.4
Re: [PATCH 0/2] Let pci_fixup_final access iommu_fwnode
Hi, Joerg On 2020/5/12 下午12:08, Zhangfei Gao wrote: Some platform devices appear as PCI but are actually on the AMBA bus, and they need fixup in drivers/pci/quirks.c handling iommu_fwnode. So calling pci_fixup_final after iommu_fwnode is allocated. For example: Hisilicon platform device need fixup in drivers/pci/quirks.c +static void quirk_huawei_pcie_sva(struct pci_dev *pdev) +{ + struct iommu_fwspec *fwspec; + + pdev->eetlp_prefix_path = 1; + fwspec = dev_iommu_fwspec_get(&pdev->dev); + if (fwspec) + fwspec->can_stall = 1; +} + +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa250, quirk_huawei_pcie_sva); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa251, quirk_huawei_pcie_sva); Zhangfei Gao (2): iommu/of: Let pci_fixup_final access iommu_fwnode ACPI/IORT: Let pci_fixup_final access iommu_fwnode drivers/acpi/arm64/iort.c | 1 + drivers/iommu/of_iommu.c | 1 + 2 files changed, 2 insertions(+) Would you mind give any suggestion? We need access fwspec->can_stall describing the platform device (a fake pcie) can support stall feature. can_stall will be used arm_smmu_add_device [1]. And stall is not a pci feature, so no such member in struct pci_dev. iommu_fwnode is allocated in iommu_fwspec_init, from of_pci_iommu_init or iort_pci_iommu_init. The pci_fixup_device(pci_fixup_final, dev) in pci_bus_add_device is too early that iommu_fwnode is not allocated. The pci_fixup_device(pci_fixup_enable, dev) in do_pci_enable_device is too late after arm_smmu_add_device. So the idea here is calling pci_fixup_device(pci_fixup_final) after of_pci_iommu_init and iort_pci_iommu_init, where iommu_fwnode is allocated. [1] https://www.spinics.net/lists/linux-pci/msg94559.html Thanks
[PATCH 0/2] Let pci_fixup_final access iommu_fwnode
Some platform devices appear as PCI but are actually on the AMBA bus, and they need fixup in drivers/pci/quirks.c handling iommu_fwnode. So calling pci_fixup_final after iommu_fwnode is allocated. For example: Hisilicon platform device need fixup in drivers/pci/quirks.c +static void quirk_huawei_pcie_sva(struct pci_dev *pdev) +{ + struct iommu_fwspec *fwspec; + + pdev->eetlp_prefix_path = 1; + fwspec = dev_iommu_fwspec_get(&pdev->dev); + if (fwspec) + fwspec->can_stall = 1; +} + +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa250, quirk_huawei_pcie_sva); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_HUAWEI, 0xa251, quirk_huawei_pcie_sva); Zhangfei Gao (2): iommu/of: Let pci_fixup_final access iommu_fwnode ACPI/IORT: Let pci_fixup_final access iommu_fwnode drivers/acpi/arm64/iort.c | 1 + drivers/iommu/of_iommu.c | 1 + 2 files changed, 2 insertions(+) -- 2.7.4
[PATCH 1/2] iommu/of: Let pci_fixup_final access iommu_fwnode
Calling pci_fixup_final after of_pci_iommu_init, which alloc iommu_fwnode. Some platform devices appear as PCI but are actually on the AMBA bus, and they need fixup in drivers/pci/quirks.c handling iommu_fwnode. So calling pci_fixup_final after iommu_fwnode is allocated. Signed-off-by: Zhangfei Gao --- drivers/iommu/of_iommu.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c index 20738aac..c1b58c4 100644 --- a/drivers/iommu/of_iommu.c +++ b/drivers/iommu/of_iommu.c @@ -188,6 +188,7 @@ const struct iommu_ops *of_iommu_configure(struct device *dev, pci_request_acs(); err = pci_for_each_dma_alias(to_pci_dev(dev), of_pci_iommu_init, &info); + pci_fixup_device(pci_fixup_final, to_pci_dev(dev)); } else if (dev_is_fsl_mc(dev)) { err = of_fsl_mc_iommu_init(to_fsl_mc_device(dev), master_np); } else { -- 2.7.4
[PATCH 2/2] ACPI/IORT: Let pci_fixup_final access iommu_fwnode
Calling pci_fixup_final after iommu_fwspec_init, which alloc iommu_fwnode. Some platform devices appear as PCI but are actually on the AMBA bus, and they need fixup in drivers/pci/quirks.c handling iommu_fwnode. So calling pci_fixup_final after iommu_fwnode is allocated. Signed-off-by: Zhangfei Gao --- drivers/acpi/arm64/iort.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c index 7d04424..02e361d 100644 --- a/drivers/acpi/arm64/iort.c +++ b/drivers/acpi/arm64/iort.c @@ -1027,6 +1027,7 @@ const struct iommu_ops *iort_iommu_configure(struct device *dev) info.node = node; err = pci_for_each_dma_alias(to_pci_dev(dev), iort_pci_iommu_init, &info); + pci_fixup_device(pci_fixup_final, to_pci_dev(dev)); fwspec = dev_iommu_fwspec_get(dev); if (fwspec && iort_pci_rc_supports_ats(node)) -- 2.7.4
Re: [PATCH v5 2/3] uacce: add uacce driver
On 2019/10/16 上午1:55, reg Kroah-Hartman wrote: On Tue, Oct 15, 2019 at 03:39:00PM +0800, zhangfei wrote: Hi, Jonathan On 2019/10/14 下午6:32, Jonathan Cameron wrote: On Mon, 14 Oct 2019 14:48:54 +0800 Zhangfei Gao wrote: From: Kenneth Lee Uacce (Unified/User-space-access-intended Accelerator Framework) targets to provide Shared Virtual Addressing (SVA) between accelerators and processes. So accelerator can access any data structure of the main cpu. This differs from the data sharing between cpu and io device, which share data content rather than address. Since unified address, hardware and user space of process can share the same virtual address in the communication. Uacce create a chrdev for every registration, the queue is allocated to the process when the chrdev is opened. Then the process can access the hardware resource by interact with the queue file. By mmap the queue file space to user space, the process can directly put requests to the hardware without syscall to the kernel space. Signed-off-by: Kenneth Lee Signed-off-by: Zaibo Xu Signed-off-by: Zhou Wang Signed-off-by: Zhangfei Gao Hi, Some superficial comments from me. Thanks for the suggestion. +/* + * While user space releases a queue, all the relatives on the queue + * should be released immediately by this putting. This one needs rewording but I'm not quite sure what relatives are in this case. + */ +static long uacce_put_queue(struct uacce_queue *q) +{ + struct uacce_device *uacce = q->uacce; + + mutex_lock(&uacce_mutex); + + if ((q->state == UACCE_Q_STARTED) && uacce->ops->stop_queue) + uacce->ops->stop_queue(q); + + if ((q->state == UACCE_Q_INIT || q->state == UACCE_Q_STARTED) && +uacce->ops->put_queue) + uacce->ops->put_queue(q); + + q->state = UACCE_Q_ZOMBIE; + mutex_unlock(&uacce_mutex); + + return 0; +} + .. + +static ssize_t qfrs_size_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct uacce_device *uacce = to_uacce_device(dev); + unsigned long size; + int i, ret; + + for (i = 0, ret = 0; i < UACCE_QFRT_MAX; i++) { + size = uacce->qf_pg_size[i] << PAGE_SHIFT; + if (i == UACCE_QFRT_SS) + break; + ret += sprintf(buf + ret, "%lu\t", size); + } + ret += sprintf(buf + ret, "%lu\n", size); + + return ret; +} This may break the sysfs rule of one thing per file. If you have multiple regions, they should probably each have their own file to give their size. Is the rule must be applied? Yes, it always must be followed. Please fix. OK, understand. Have updated in v6. Thanks for the patience.
[PATCH v6 1/3] uacce: Add documents for uacce
From: Kenneth Lee Uacce (Unified/User-space-access-intended Accelerator Framework) is a kernel module targets to provide Shared Virtual Addressing (SVA) between the accelerator and process. This patch add document to explain how it works. Signed-off-by: Kenneth Lee Signed-off-by: Zaibo Xu Signed-off-by: Zhou Wang Signed-off-by: Zhangfei Gao --- Documentation/misc-devices/uacce.rst | 297 +++ 1 file changed, 297 insertions(+) create mode 100644 Documentation/misc-devices/uacce.rst diff --git a/Documentation/misc-devices/uacce.rst b/Documentation/misc-devices/uacce.rst new file mode 100644 index 000..05c1e09 --- /dev/null +++ b/Documentation/misc-devices/uacce.rst @@ -0,0 +1,297 @@ +.. SPDX-License-Identifier: GPL-2.0 + +Introduction of Uacce += + +Uacce (Unified/User-space-access-intended Accelerator Framework) targets to +provide Shared Virtual Addressing (SVA) between accelerators and processes. +So accelerator can access any data structure of the main cpu. +This differs from the data sharing between cpu and io device, which share +data content rather than address. +Because of the unified address, hardware and user space of process can +share the same virtual address in the communication. +Uacce takes the hardware accelerator as a heterogeneous processor, while +IOMMU share the same CPU page tables and as a result the same translation +from va to pa. + +__ __ + | | | | + | User application (CPU) | | Hardware Accelerator | + |__| |__| + +| | +| va | va +V V + ____ +| | | | +| MMU| | IOMMU | +|__| |__| +| | +| | +V pa V pa +___ + | | + | Memory | + |___| + + + +Architecture + + +Uacce is the kernel module, taking charge of iommu and address sharing. +The user drivers and libraries are called WarpDrive. + +A virtual concept, queue, is used for the communication. It provides a +FIFO-like interface. And it maintains a unified address space between the +application and all involved hardware. + + ___ +| | user API | | +| WarpDrive library | > | user driver | +|___| || + || + || + | queue fd | + || + || + v| + ___ _| +| | | | | mmap memory +| Other framework | | uacce | | r/w interface +| crypto/nic/others | |_| | +|___| | + | || + | register | register | + | || + | || + |_ __ | + | | | | | | + - | Device Driver | | IOMMU | | + |_| |__| | + || + |
[PATCH v6 0/3] Add uacce module for Accelerator
Uacce (Unified/User-space-access-intended Accelerator Framework) targets to provide Shared Virtual Addressing (SVA) between accelerators and processes. So accelerator can access any data structure of the main cpu. This differs from the data sharing between cpu and io device, which share data content rather than address. Because of unified address, hardware and user space of process can share the same virtual address in the communication. Uacce is intended to be used with Jean Philippe Brucker's SVA patchset[1], which enables IO side page fault and PASID support. We have keep verifying with Jean's sva/current [2] We also keep verifying with Eric's SMMUv3 Nested Stage patch [3] This series and related zip & qm driver https://github.com/Linaro/linux-kernel-warpdrive/tree/5.4-rc1-uacce-v6 The library and user application: https://github.com/Linaro/warpdrive/tree/wdprd-v1-upstream References: [1] http://jpbrucker.net/sva/ [2] http://www.linux-arm.org/git?p=linux-jpb.git;a=shortlog;h=refs/heads/sva/current [3] https://github.com/eauger/linux/tree/v5.3.0-rc0-2stage-v9 Change History: v6: Change sys qfrs_size to different file, suggested by Jonathan Fix crypto daily build issue and based on crypto code base, also 5.4-rc1. v5: Add an example patch using the uacce interface, suggested by Greg 0003-crypto-hisilicon-register-zip-engine-to-uacce.patch v4: Based on 5.4-rc1 Considering other driver integrating uacce, if uacce not compiled, uacce_register return error and uacce_unregister is empty. Simplify uacce flag: UACCE_DEV_SVA. Address Greg's comments: Fix state machine, remove potential syslog triggered from user space etc. v3: Recommended by Greg, use sturct uacce_device instead of struct uacce, and use struct *cdev in struct uacce_device, as a result, cdev can be released by itself when refcount decreased to 0. So the two structures are decoupled and self-maintained by themsleves. Also add dev.release for put_device. v2: Address comments from Greg and Jonathan Modify interface uacce_register Drop noiommu mode first v1: 1. Rebase to 5.3-rc1 2. Build on iommu interface 3. Verifying with Jean's sva and Eric's nested mode iommu. 4. User library has developed a lot: support zlib, openssl etc. 5. Move to misc first RFC3: https://lkml.org/lkml/2018/11/12/1951 RFC2: https://lwn.net/Articles/763990/ Background of why Uacce: Von Neumann processor is not good at general data manipulation. It is designed for control-bound rather than data-bound application. The latter need less control path facility and more/specific ALUs. So there are more and more heterogeneous processors, such as encryption/decryption accelerators, TPUs, or EDGE (Explicated Data Graph Execution) processors, introduced to gain better performance or power efficiency for particular applications these days. There are generally two ways to make use of these heterogeneous processors: The first is to make them co-processors, just like FPU. This is good for some application but it has its own cons: It changes the ISA set permanently. You must save all state elements when the process is switched out. But most data-bound processors have a huge set of state elements. It makes the kernel scheduler more complex. The second is Accelerator. It is taken as a IO device from the CPU's point of view (but it need not to be physically). The process, running on CPU, hold a context of the accelerator and send instructions to it as if it calls a function or thread running with FPU. The context is bound with the processor itself. So the state elements remain in the hardware context until the context is released. We believe this is the core feature of an "Accelerator" vs. Co-processor or other heterogeneous processors. The intention of Uacce is to provide the basic facility to backup this scenario. Its first step is to make sure the accelerator and process can share the same address space. So the accelerator ISA can directly address any data structure of the main CPU. This differs from the data sharing between CPU and IO device, which share data content rather than address. So it is different comparing to the other DMA libraries. In the future, we may add more facility to support linking accelerator library to the main application, or managing the accelerator context as special thread. But no matter how, this can be a solid start point for new processor to be used as an "accelerator" as this is the essential requirement. Kenneth Lee (2): uacce: Add documents for uacce uacce: add uacce driver Zhangfei Gao (1): crypto: hisilicon - register zip engine to uacce Documentation/ABI/testing/sysfs-driver-uacce | 65 ++ Documentation/misc-devices/uacce.rst | 297 drivers/crypto/hisilicon/qm.c| 254 ++- drivers/crypto/hisilicon/qm.h| 13 +- drivers/crypto/hisilicon/zip/zip_main.c | 39 +- drivers/misc/Kconfig
Re: [PATCH v5 2/3] uacce: add uacce driver
Hi, Jonathan On 2019/10/14 下午6:32, Jonathan Cameron wrote: On Mon, 14 Oct 2019 14:48:54 +0800 Zhangfei Gao wrote: From: Kenneth Lee Uacce (Unified/User-space-access-intended Accelerator Framework) targets to provide Shared Virtual Addressing (SVA) between accelerators and processes. So accelerator can access any data structure of the main cpu. This differs from the data sharing between cpu and io device, which share data content rather than address. Since unified address, hardware and user space of process can share the same virtual address in the communication. Uacce create a chrdev for every registration, the queue is allocated to the process when the chrdev is opened. Then the process can access the hardware resource by interact with the queue file. By mmap the queue file space to user space, the process can directly put requests to the hardware without syscall to the kernel space. Signed-off-by: Kenneth Lee Signed-off-by: Zaibo Xu Signed-off-by: Zhou Wang Signed-off-by: Zhangfei Gao Hi, Some superficial comments from me. Thanks for the suggestion. +/* + * While user space releases a queue, all the relatives on the queue + * should be released immediately by this putting. This one needs rewording but I'm not quite sure what relatives are in this case. + */ +static long uacce_put_queue(struct uacce_queue *q) +{ + struct uacce_device *uacce = q->uacce; + + mutex_lock(&uacce_mutex); + + if ((q->state == UACCE_Q_STARTED) && uacce->ops->stop_queue) + uacce->ops->stop_queue(q); + + if ((q->state == UACCE_Q_INIT || q->state == UACCE_Q_STARTED) && +uacce->ops->put_queue) + uacce->ops->put_queue(q); + + q->state = UACCE_Q_ZOMBIE; + mutex_unlock(&uacce_mutex); + + return 0; +} + .. + +static ssize_t qfrs_size_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct uacce_device *uacce = to_uacce_device(dev); + unsigned long size; + int i, ret; + + for (i = 0, ret = 0; i < UACCE_QFRT_MAX; i++) { + size = uacce->qf_pg_size[i] << PAGE_SHIFT; + if (i == UACCE_QFRT_SS) + break; + ret += sprintf(buf + ret, "%lu\t", size); + } + ret += sprintf(buf + ret, "%lu\n", size); + + return ret; +} This may break the sysfs rule of one thing per file. If you have multiple regions, they should probably each have their own file to give their size. Is the rule must be applied? Documentation/filesystems/sysfs.txt Attributes should be ASCII text files, preferably with only one value per file. It is noted that it may not be efficient to contain only one value per file, so it is socially acceptable to express an array of values of the same type. We may have efficiency here. For extensibility in future, UACCE_QFRT_MAX=16, do we export all of them? In sva case only 2 region is used, and 4 at most if sva is not supported. Do you think it is more efficient to put in one file? + +/** + * uacce_unregister - unregisters a uacce + * @uacce: the accelerator to unregister + * + * Unregister an accelerator that wat previously successully registered with wat -> was successully -> successfully ... +/** + * struct uacce_queue + * @uacce: pointer to uacce + * @priv: private pointer + * @wait: wait queue head + * @pasid: pasid of the queue + * @handle: iommu_sva handle return from iommu_sva_bind_device + * @list: share list for qfr->qs + * @mm: current->mm + * @qfrs: pointer of qfr regions Missing state. Make sure to run ./scripts/kernel-doc FILENAME > /dev/null and fix any errors that show up. Good tool, will use it to check. Thanks.
[PATCH v5 3/3] crypto: hisilicon - register zip engine to uacce
qm using uacce as an example, will resubmit after uacce is merged. Signed-off-by: Zhangfei Gao Signed-off-by: Zhou Wang --- drivers/crypto/hisilicon/qm.c | 254 ++-- drivers/crypto/hisilicon/qm.h | 13 +- drivers/crypto/hisilicon/zip/zip_main.c | 39 ++--- include/uapi/misc/uacce/qm.h| 15 ++ 4 files changed, 285 insertions(+), 36 deletions(-) create mode 100644 include/uapi/misc/uacce/qm.h diff --git a/drivers/crypto/hisilicon/qm.c b/drivers/crypto/hisilicon/qm.c index f975c39..60067d8 100644 --- a/drivers/crypto/hisilicon/qm.c +++ b/drivers/crypto/hisilicon/qm.c @@ -9,6 +9,9 @@ #include #include #include +#include +#include +#include #include "qm.h" /* eq/aeq irq enable */ @@ -459,17 +462,22 @@ static void qm_cq_head_update(struct hisi_qp *qp) static void qm_poll_qp(struct hisi_qp *qp, struct hisi_qm *qm) { - struct qm_cqe *cqe = qp->cqe + qp->qp_status.cq_head; - - if (qp->req_cb) { - while (QM_CQE_PHASE(cqe) == qp->qp_status.cqc_phase) { - dma_rmb(); - qp->req_cb(qp, qp->sqe + qm->sqe_size * cqe->sq_head); - qm_cq_head_update(qp); - cqe = qp->cqe + qp->qp_status.cq_head; - qm_db(qm, qp->qp_id, QM_DOORBELL_CMD_CQ, - qp->qp_status.cq_head, 0); - atomic_dec(&qp->qp_status.used); + struct qm_cqe *cqe; + + if (qp->event_cb) { + qp->event_cb(qp); + } else { + cqe = qp->cqe + qp->qp_status.cq_head; + + if (qp->req_cb) { + while (QM_CQE_PHASE(cqe) == qp->qp_status.cqc_phase) { + dma_rmb(); + qp->req_cb(qp, qp->sqe + qm->sqe_size * + cqe->sq_head); + qm_cq_head_update(qp); + cqe = qp->cqe + qp->qp_status.cq_head; + atomic_dec(&qp->qp_status.used); + } } /* set c_flag */ @@ -1391,6 +1399,221 @@ static void hisi_qm_cache_wb(struct hisi_qm *qm) } } +static void qm_qp_event_notifier(struct hisi_qp *qp) +{ + wake_up_interruptible(&qp->uacce_q->wait); +} + +static int hisi_qm_get_available_instances(struct uacce_device *uacce) +{ + int i, ret; + struct hisi_qm *qm = uacce->priv; + + read_lock(&qm->qps_lock); + for (i = 0, ret = 0; i < qm->qp_num; i++) + if (!qm->qp_array[i]) + ret++; + read_unlock(&qm->qps_lock); + + return ret; +} + +static int hisi_qm_uacce_get_queue(struct uacce_device *uacce, + unsigned long arg, + struct uacce_queue *q) +{ + struct hisi_qm *qm = uacce->priv; + struct hisi_qp *qp; + u8 alg_type = 0; + + qp = hisi_qm_create_qp(qm, alg_type); + if (IS_ERR(qp)) + return PTR_ERR(qp); + + q->priv = qp; + q->uacce = uacce; + qp->uacce_q = q; + qp->event_cb = qm_qp_event_notifier; + qp->pasid = arg; + + return 0; +} + +static void hisi_qm_uacce_put_queue(struct uacce_queue *q) +{ + struct hisi_qp *qp = q->priv; + + /* +* As put_queue is only called in uacce_mode=1, and only one queue can +* be used in this mode. we flush all sqc cache back in put queue. +*/ + hisi_qm_cache_wb(qp->qm); + + /* need to stop hardware, but can not support in v1 */ + hisi_qm_release_qp(qp); +} + +/* map sq/cq/doorbell to user space */ +static int hisi_qm_uacce_mmap(struct uacce_queue *q, + struct vm_area_struct *vma, + struct uacce_qfile_region *qfr) +{ + struct hisi_qp *qp = q->priv; + struct hisi_qm *qm = qp->qm; + size_t sz = vma->vm_end - vma->vm_start; + struct pci_dev *pdev = qm->pdev; + struct device *dev = &pdev->dev; + unsigned long vm_pgoff; + int ret; + + switch (qfr->type) { + case UACCE_QFRT_MMIO: + if (qm->ver == QM_HW_V2) { + if (sz > PAGE_SIZE * (QM_DOORBELL_PAGE_NR + + QM_DOORBELL_SQ_CQ_BASE_V2 / PAGE_SIZE)) + return -EINVAL; + } else { + if (sz > PAGE_SIZE * QM_DOORBELL_PAGE_NR) + return -EINVAL; + } + + vma->vm_flags |= VM_IO; + + return remap_pfn_range(vma, vma->vm_start, + qm->phys_base >&
[PATCH v5 2/3] uacce: add uacce driver
From: Kenneth Lee Uacce (Unified/User-space-access-intended Accelerator Framework) targets to provide Shared Virtual Addressing (SVA) between accelerators and processes. So accelerator can access any data structure of the main cpu. This differs from the data sharing between cpu and io device, which share data content rather than address. Since unified address, hardware and user space of process can share the same virtual address in the communication. Uacce create a chrdev for every registration, the queue is allocated to the process when the chrdev is opened. Then the process can access the hardware resource by interact with the queue file. By mmap the queue file space to user space, the process can directly put requests to the hardware without syscall to the kernel space. Signed-off-by: Kenneth Lee Signed-off-by: Zaibo Xu Signed-off-by: Zhou Wang Signed-off-by: Zhangfei Gao --- Documentation/ABI/testing/sysfs-driver-uacce | 47 ++ drivers/misc/Kconfig | 1 + drivers/misc/Makefile| 1 + drivers/misc/uacce/Kconfig | 13 + drivers/misc/uacce/Makefile | 2 + drivers/misc/uacce/uacce.c | 974 +++ include/linux/uacce.h| 167 + include/uapi/misc/uacce/uacce.h | 34 + 8 files changed, 1239 insertions(+) create mode 100644 Documentation/ABI/testing/sysfs-driver-uacce create mode 100644 drivers/misc/uacce/Kconfig create mode 100644 drivers/misc/uacce/Makefile create mode 100644 drivers/misc/uacce/uacce.c create mode 100644 include/linux/uacce.h create mode 100644 include/uapi/misc/uacce/uacce.h diff --git a/Documentation/ABI/testing/sysfs-driver-uacce b/Documentation/ABI/testing/sysfs-driver-uacce new file mode 100644 index 000..b1a2c60 --- /dev/null +++ b/Documentation/ABI/testing/sysfs-driver-uacce @@ -0,0 +1,47 @@ +What: /sys/class/uacce/hisi_zip-/id +Date: Oct 2019 +KernelVersion: 5.5 +Contact:linux-accelerat...@lists.ozlabs.org +Description:Id of the device. + +What: /sys/class/uacce/hisi_zip-/api +Date: Oct 2019 +KernelVersion: 5.5 +Contact:linux-accelerat...@lists.ozlabs.org +Description:Api of the device, used by application to match the correct driver + +What: /sys/class/uacce/hisi_zip-/flags +Date: Oct 2019 +KernelVersion: 5.5 +Contact:linux-accelerat...@lists.ozlabs.org +Description:Attributes of the device, see UACCE_DEV_xxx flag defined in uacce.h + +What: /sys/class/uacce/hisi_zip-/available_instances +Date: Oct 2019 +KernelVersion: 5.5 +Contact:linux-accelerat...@lists.ozlabs.org +Description:Available instances left of the device + +What: /sys/class/uacce/hisi_zip-/algorithms +Date: Oct 2019 +KernelVersion: 5.5 +Contact:linux-accelerat...@lists.ozlabs.org +Description:Algorithms supported by this accelerator + +What: /sys/class/uacce/hisi_zip-/qfrs_size +Date: Oct 2019 +KernelVersion: 5.5 +Contact:linux-accelerat...@lists.ozlabs.org +Description:Page size of each queue file regions + +What: /sys/class/uacce/hisi_zip-/numa_distance +Date: Oct 2019 +KernelVersion: 5.5 +Contact:linux-accelerat...@lists.ozlabs.org +Description:Distance of device node to cpu node + +What: /sys/class/uacce/hisi_zip-/node_id +Date: Oct 2019 +KernelVersion: 5.5 +Contact:linux-accelerat...@lists.ozlabs.org +Description:Id of the numa node diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig index c55b637..929feb0 100644 --- a/drivers/misc/Kconfig +++ b/drivers/misc/Kconfig @@ -481,4 +481,5 @@ source "drivers/misc/cxl/Kconfig" source "drivers/misc/ocxl/Kconfig" source "drivers/misc/cardreader/Kconfig" source "drivers/misc/habanalabs/Kconfig" +source "drivers/misc/uacce/Kconfig" endmenu diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile index c1860d3..9abf292 100644 --- a/drivers/misc/Makefile +++ b/drivers/misc/Makefile @@ -56,4 +56,5 @@ obj-$(CONFIG_OCXL)+= ocxl/ obj-y += cardreader/ obj-$(CONFIG_PVPANIC) += pvpanic.o obj-$(CONFIG_HABANA_AI)+= habanalabs/ +obj-$(CONFIG_UACCE)+= uacce/ obj-$(CONFIG_XILINX_SDFEC) += xilinx_sdfec.o diff --git a/drivers/misc/uacce/Kconfig b/drivers/misc/uacce/Kconfig new file mode 100644 index 000..e854354 --- /dev/null +++ b/drivers/misc/uacce/Kconfig @@ -0,0 +1,13 @@ +config UACCE + tristate "Accelerator Framework for User Land" + depends on IOMMU_API + help + UACCE provides interface for the user process to access the hardware + without interaction with the kernel space in data path. + + The user-space interface is described in +
[PATCH v5 0/3] Add uacce module for Accelerator
Uacce (Unified/User-space-access-intended Accelerator Framework) targets to provide Shared Virtual Addressing (SVA) between accelerators and processes. So accelerator can access any data structure of the main cpu. This differs from the data sharing between cpu and io device, which share data content rather than address. Because of unified address, hardware and user space of process can share the same virtual address in the communication. Uacce is intended to be used with Jean Philippe Brucker's SVA patchset[1], which enables IO side page fault and PASID support. We have keep verifying with Jean's sva/current [2] We also keep verifying with Eric's SMMUv3 Nested Stage patch [3] This series and related zip & qm driver https://github.com/Linaro/linux-kernel-warpdrive/tree/5.4-rc1-uacce-v5 The library and user application: https://github.com/Linaro/warpdrive/tree/wdprd-v1-upstream References: [1] http://jpbrucker.net/sva/ [2] http://www.linux-arm.org/git?p=linux-jpb.git;a=shortlog;h=refs/heads/sva/current [3] https://github.com/eauger/linux/tree/v5.3.0-rc0-2stage-v9 Change History: v5: Add an example patch using the uacce interface, suggested by Greg 0003-crypto-hisilicon-register-zip-engine-to-uacce.patch v4: Based on 5.4-rc1 Considering other driver integrating uacce, if uacce not compiled, uacce_register return error and uacce_unregister is empty. Simplify uacce flag: UACCE_DEV_SVA. Address Greg's comments: Fix state machine, remove potential syslog triggered from user space etc. v3: Recommended by Greg, use sturct uacce_device instead of struct uacce, and use struct *cdev in struct uacce_device, as a result, cdev can be released by itself when refcount decreased to 0. So the two structures are decoupled and self-maintained by themsleves. Also add dev.release for put_device. v2: Address comments from Greg and Jonathan Modify interface uacce_register Drop noiommu mode first v1: 1. Rebase to 5.3-rc1 2. Build on iommu interface 3. Verifying with Jean's sva and Eric's nested mode iommu. 4. User library has developed a lot: support zlib, openssl etc. 5. Move to misc first RFC3: https://lkml.org/lkml/2018/11/12/1951 RFC2: https://lwn.net/Articles/763990/ Background of why Uacce: Von Neumann processor is not good at general data manipulation. It is designed for control-bound rather than data-bound application. The latter need less control path facility and more/specific ALUs. So there are more and more heterogeneous processors, such as encryption/decryption accelerators, TPUs, or EDGE (Explicated Data Graph Execution) processors, introduced to gain better performance or power efficiency for particular applications these days. There are generally two ways to make use of these heterogeneous processors: The first is to make them co-processors, just like FPU. This is good for some application but it has its own cons: It changes the ISA set permanently. You must save all state elements when the process is switched out. But most data-bound processors have a huge set of state elements. It makes the kernel scheduler more complex. The second is Accelerator. It is taken as a IO device from the CPU's point of view (but it need not to be physically). The process, running on CPU, hold a context of the accelerator and send instructions to it as if it calls a function or thread running with FPU. The context is bound with the processor itself. So the state elements remain in the hardware context until the context is released. We believe this is the core feature of an "Accelerator" vs. Co-processor or other heterogeneous processors. The intention of Uacce is to provide the basic facility to backup this scenario. Its first step is to make sure the accelerator and process can share the same address space. So the accelerator ISA can directly address any data structure of the main CPU. This differs from the data sharing between CPU and IO device, which share data content rather than address. So it is different comparing to the other DMA libraries. In the future, we may add more facility to support linking accelerator library to the main application, or managing the accelerator context as special thread. But no matter how, this can be a solid start point for new processor to be used as an "accelerator" as this is the essential requirement. Kenneth Lee (2): uacce: Add documents for uacce uacce: add uacce driver Zhangfei Gao (1): crypto: hisilicon - register zip engine to uacce Documentation/ABI/testing/sysfs-driver-uacce | 47 ++ Documentation/misc-devices/uacce.rst | 297 drivers/crypto/hisilicon/qm.c| 259 ++- drivers/crypto/hisilicon/qm.h| 13 +- drivers/crypto/hisilicon/zip/zip_main.c | 39 +- drivers/misc/Kconfig | 1 + drivers/misc/Makefile| 1 + drivers/misc/uacce/Kconfig | 13 + drivers/misc/uacce/Ma
[PATCH v5 1/3] uacce: Add documents for uacce
From: Kenneth Lee Uacce (Unified/User-space-access-intended Accelerator Framework) is a kernel module targets to provide Shared Virtual Addressing (SVA) between the accelerator and process. This patch add document to explain how it works. Signed-off-by: Kenneth Lee Signed-off-by: Zaibo Xu Signed-off-by: Zhou Wang Signed-off-by: Zhangfei Gao --- Documentation/misc-devices/uacce.rst | 297 +++ 1 file changed, 297 insertions(+) create mode 100644 Documentation/misc-devices/uacce.rst diff --git a/Documentation/misc-devices/uacce.rst b/Documentation/misc-devices/uacce.rst new file mode 100644 index 000..1ddf4ff --- /dev/null +++ b/Documentation/misc-devices/uacce.rst @@ -0,0 +1,297 @@ +.. SPDX-License-Identifier: GPL-2.0 + +Introduction of Uacce += + +Uacce (Unified/User-space-access-intended Accelerator Framework) targets to +provide Shared Virtual Addressing (SVA) between accelerators and processes. +So accelerator can access any data structure of the main cpu. +This differs from the data sharing between cpu and io device, which share +data content rather than address. +Because of the unified address, hardware and user space of process can +share the same virtual address in the communication. +Uacce takes the hardware accelerator as a heterogeneous processor, while +IOMMU share the same CPU page tables and as a result the same translation +from va to pa. + +__ __ + | | | | + | User application (CPU) | | Hardware Accelerator | + |__| |__| + +| | +| va | va +V V + ____ +| | | | +| MMU| | IOMMU | +|__| |__| +| | +| | +V pa V pa +___ + | | + | Memory | + |___| + + + +Architecture + + +Uacce is the kernel module, taking charge of iommu and address sharing. +The user drivers and libraries are called WarpDrive. + +A virtual concept, queue, is used for the communication. It provides a +FIFO-like interface. And it maintains a unified address space between the +application and all involved hardware. + + ___ +| | user API | | +| WarpDrive library | > | user driver | +|___| || + || + || + | queue fd | + || + || + v| + ___ _| +| | | | | mmap memory +| Other framework | | uacce | | r/w interface +| crypto/nic/others | |_| | +|___| | + | || + | register | register | + | || + | || + |_ __ | + | | | | | | + - | Device Driver | | IOMMU | | + |_| |__| | + || + |
[RESEND PATCH v4 2/2] uacce: add uacce driver
From: Kenneth Lee Uacce (Unified/User-space-access-intended Accelerator Framework) targets to provide Shared Virtual Addressing (SVA) between accelerators and processes. So accelerator can access any data structure of the main cpu. This differs from the data sharing between cpu and io device, which share data content rather than address. Since unified address, hardware and user space of process can share the same virtual address in the communication. Uacce create a chrdev for every registration, the queue is allocated to the process when the chrdev is opened. Then the process can access the hardware resource by interact with the queue file. By mmap the queue file space to user space, the process can directly put requests to the hardware without syscall to the kernel space. Signed-off-by: Kenneth Lee Signed-off-by: Zaibo Xu Signed-off-by: Zhou Wang Signed-off-by: Zhangfei Gao --- Documentation/ABI/testing/sysfs-driver-uacce | 47 ++ drivers/misc/Kconfig |1 + drivers/misc/Makefile|1 + drivers/misc/uacce/Kconfig | 13 + drivers/misc/uacce/Makefile |2 + drivers/misc/uacce/uacce.c | 1013 ++ include/linux/uacce.h| 167 + include/uapi/misc/uacce.h| 36 + 8 files changed, 1280 insertions(+) create mode 100644 Documentation/ABI/testing/sysfs-driver-uacce create mode 100644 drivers/misc/uacce/Kconfig create mode 100644 drivers/misc/uacce/Makefile create mode 100644 drivers/misc/uacce/uacce.c create mode 100644 include/linux/uacce.h create mode 100644 include/uapi/misc/uacce.h diff --git a/Documentation/ABI/testing/sysfs-driver-uacce b/Documentation/ABI/testing/sysfs-driver-uacce new file mode 100644 index 000..b7ff6af2 --- /dev/null +++ b/Documentation/ABI/testing/sysfs-driver-uacce @@ -0,0 +1,47 @@ +What: /sys/class/uacce/hisi_zip-/id +Date: Oct 2019 +KernelVersion: 5.4 +Contact:linux-accelerat...@lists.ozlabs.org +Description:Id of the device. + +What: /sys/class/uacce/hisi_zip-/api +Date: Oct 2019 +KernelVersion: 5.4 +Contact:linux-accelerat...@lists.ozlabs.org +Description:Api of the device, used by application to match the correct driver + +What: /sys/class/uacce/hisi_zip-/flags +Date: Oct 2019 +KernelVersion: 5.4 +Contact:linux-accelerat...@lists.ozlabs.org +Description:Attributes of the device, see UACCE_DEV_xxx flag defined in uacce.h + +What: /sys/class/uacce/hisi_zip-/available_instances +Date: Oct 2019 +KernelVersion: 5.4 +Contact:linux-accelerat...@lists.ozlabs.org +Description:Available instances left of the device + +What: /sys/class/uacce/hisi_zip-/algorithms +Date: Oct 2019 +KernelVersion: 5.4 +Contact:linux-accelerat...@lists.ozlabs.org +Description:Algorithms supported by this accelerator + +What: /sys/class/uacce/hisi_zip-/qfrs_offset +Date: Oct 2019 +KernelVersion: 5.4 +Contact:linux-accelerat...@lists.ozlabs.org +Description:Page offsets of each queue file regions + +What: /sys/class/uacce/hisi_zip-/numa_distance +Date: Oct 2019 +KernelVersion: 5.4 +Contact:linux-accelerat...@lists.ozlabs.org +Description:Distance of device node to cpu node + +What: /sys/class/uacce/hisi_zip-/node_id +Date: Oct 2019 +KernelVersion: 5.4 +Contact:linux-accelerat...@lists.ozlabs.org +Description:Id of the numa node diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig index c55b637..929feb0 100644 --- a/drivers/misc/Kconfig +++ b/drivers/misc/Kconfig @@ -481,4 +481,5 @@ source "drivers/misc/cxl/Kconfig" source "drivers/misc/ocxl/Kconfig" source "drivers/misc/cardreader/Kconfig" source "drivers/misc/habanalabs/Kconfig" +source "drivers/misc/uacce/Kconfig" endmenu diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile index c1860d3..9abf292 100644 --- a/drivers/misc/Makefile +++ b/drivers/misc/Makefile @@ -56,4 +56,5 @@ obj-$(CONFIG_OCXL)+= ocxl/ obj-y += cardreader/ obj-$(CONFIG_PVPANIC) += pvpanic.o obj-$(CONFIG_HABANA_AI)+= habanalabs/ +obj-$(CONFIG_UACCE)+= uacce/ obj-$(CONFIG_XILINX_SDFEC) += xilinx_sdfec.o diff --git a/drivers/misc/uacce/Kconfig b/drivers/misc/uacce/Kconfig new file mode 100644 index 000..e854354 --- /dev/null +++ b/drivers/misc/uacce/Kconfig @@ -0,0 +1,13 @@ +config UACCE + tristate "Accelerator Framework for User Land" + depends on IOMMU_API + help + UACCE provides interface for the user process to access the hardware + without interaction with the kernel space in data path. + + The user-space interface is desc
[RESEND PATCH v4 1/2] uacce: Add documents for uacce
From: Kenneth Lee Uacce (Unified/User-space-access-intended Accelerator Framework) is a kernel module targets to provide Shared Virtual Addressing (SVA) between the accelerator and process. This patch add document to explain how it works. Signed-off-by: Kenneth Lee Signed-off-by: Zaibo Xu Signed-off-by: Zhou Wang Signed-off-by: Zhangfei Gao --- Documentation/misc-devices/uacce.rst | 297 +++ 1 file changed, 297 insertions(+) create mode 100644 Documentation/misc-devices/uacce.rst diff --git a/Documentation/misc-devices/uacce.rst b/Documentation/misc-devices/uacce.rst new file mode 100644 index 000..b3cf0d5 --- /dev/null +++ b/Documentation/misc-devices/uacce.rst @@ -0,0 +1,297 @@ +.. SPDX-License-Identifier: GPL-2.0 + +Introduction of Uacce += + +Uacce (Unified/User-space-access-intended Accelerator Framework) targets to +provide Shared Virtual Addressing (SVA) between accelerators and processes. +So accelerator can access any data structure of the main cpu. +This differs from the data sharing between cpu and io device, which share +data content rather than address. +Because of the unified address, hardware and user space of process can +share the same virtual address in the communication. +Uacce takes the hardware accelerator as a heterogeneous processor, while +IOMMU share the same CPU page tables and as a result the same translation +from va to pa. + +__ __ + | | | | + | User application (CPU) | | Hardware Accelerator | + |__| |__| + +| | +| va | va +V V + ____ +| | | | +| MMU| | IOMMU | +|__| |__| +| | +| | +V pa V pa +___ + | | + | Memory | + |___| + + + +Architecture + + +Uacce is the kernel module, taking charge of iommu and address sharing. +The user drivers and libraries are called WarpDrive. + +A virtual concept, queue, is used for the communication. It provides a +FIFO-like interface. And it maintains a unified address space between the +application and all involved hardware. + + ___ +| | user API | | +| WarpDrive library | > | user driver | +|___| || + || + || + | queue fd | + || + || + v| + ___ _| +| | | | | mmap memory +| Other framework | | uacce | | r/w interface +| crypto/nic/others | |_| | +|___| | + | || + | register | register | + | || + | || + |_ __ | + | | | | | | + - | Device Driver | | IOMMU | | + |_| |__| | + || + |
[RESEND PATCH v4 0/2] Add uacce module for Accelerator
Uacce (Unified/User-space-access-intended Accelerator Framework) targets to provide Shared Virtual Addressing (SVA) between accelerators and processes. So accelerator can access any data structure of the main cpu. This differs from the data sharing between cpu and io device, which share data content rather than address. Because of unified address, hardware and user space of process can share the same virtual address in the communication. Uacce is intended to be used with Jean Philippe Brucker's SVA patchset[1], which enables IO side page fault and PASID support. We have keep verifying with Jean's sva/current [2] We also keep verifying with Eric's SMMUv3 Nested Stage patch [3] This series and related zip & qm driver https://github.com/Linaro/linux-kernel-warpdrive/tree/5.4-rc1-uacce-v4 The library and user application: https://github.com/Linaro/warpdrive/tree/wdprd-v1-upstream References: [1] http://jpbrucker.net/sva/ [2] http://www.linux-arm.org/git?p=linux-jpb.git;a=shortlog;h=refs/heads/sva/current [3] https://github.com/eauger/linux/tree/v5.3.0-rc0-2stage-v9 Change History: v4: Based on 5.4-rc1 Considering other driver integrating uacce, if uacce not compiled, uacce_register return error and uacce_unregister is empty. Simplify uacce flag: UACCE_DEV_SVA. Address Greg's comments: Fix state machine, remove potential syslog triggered from user space etc. v3: Recommended by Greg, use sturct uacce_device instead of struct uacce, and use struct *cdev in struct uacce_device, as a result, cdev can be released by itself when refcount decreased to 0. So the two structures are decoupled and self-maintained by themsleves. Also add dev.release for put_device. v2: Address comments from Greg and Jonathan Modify interface uacce_register Drop noiommu mode first v1: 1. Rebase to 5.3-rc1 2. Build on iommu interface 3. Verifying with Jean's sva and Eric's nested mode iommu. 4. User library has developed a lot: support zlib, openssl etc. 5. Move to misc first RFC3: https://lkml.org/lkml/2018/11/12/1951 RFC2: https://lwn.net/Articles/763990/ Background of why Uacce: Von Neumann processor is not good at general data manipulation. It is designed for control-bound rather than data-bound application. The latter need less control path facility and more/specific ALUs. So there are more and more heterogeneous processors, such as encryption/decryption accelerators, TPUs, or EDGE (Explicated Data Graph Execution) processors, introduced to gain better performance or power efficiency for particular applications these days. There are generally two ways to make use of these heterogeneous processors: The first is to make them co-processors, just like FPU. This is good for some application but it has its own cons: It changes the ISA set permanently. You must save all state elements when the process is switched out. But most data-bound processors have a huge set of state elements. It makes the kernel scheduler more complex. The second is Accelerator. It is taken as a IO device from the CPU's point of view (but it need not to be physically). The process, running on CPU, hold a context of the accelerator and send instructions to it as if it calls a function or thread running with FPU. The context is bound with the processor itself. So the state elements remain in the hardware context until the context is released. We believe this is the core feature of an "Accelerator" vs. Co-processor or other heterogeneous processors. The intention of Uacce is to provide the basic facility to backup this scenario. Its first step is to make sure the accelerator and process can share the same address space. So the accelerator ISA can directly address any data structure of the main CPU. This differs from the data sharing between CPU and IO device, which share data content rather than address. So it is different comparing to the other DMA libraries. In the future, we may add more facility to support linking accelerator library to the main application, or managing the accelerator context as special thread. But no matter how, this can be a solid start point for new processor to be used as an "accelerator" as this is the essential requirement. Kenneth Lee (2): uacce: Add documents for uacce uacce: add uacce driver Documentation/ABI/testing/sysfs-driver-uacce | 47 ++ Documentation/misc-devices/uacce.rst | 297 drivers/misc/Kconfig |1 + drivers/misc/Makefile|1 + drivers/misc/uacce/Kconfig | 13 + drivers/misc/uacce/Makefile |2 + drivers/misc/uacce/uacce.c | 1013 ++ include/linux/uacce.h| 167 + include/uapi/misc/uacce.h| 36 + 9 files changed, 1577 insertions(+) create mode 100644 Documentation/ABI/testing/sysfs-driver-uacce create mode 100644 Documentation/misc-devices/uacce.rst create mode 100644 drivers/misc/uacce/Kconf
[PATCH v4 2/2] uacce: add uacce driver
From: Kenneth Lee Uacce (Unified/User-space-access-intended Accelerator Framework) targets to provide Shared Virtual Addressing (SVA) between accelerators and processes. So accelerator can access any data structure of the main cpu. This differs from the data sharing between cpu and io device, which share data content rather than address. Since unified address, hardware and user space of process can share the same virtual address in the communication. Uacce create a chrdev for every registration, the queue is allocated to the process when the chrdev is opened. Then the process can access the hardware resource by interact with the queue file. By mmap the queue file space to user space, the process can directly put requests to the hardware without syscall to the kernel space. Signed-off-by: Kenneth Lee Signed-off-by: Zaibo Xu Signed-off-by: Zhou Wang Signed-off-by: Zhangfei Gao --- Documentation/ABI/testing/sysfs-driver-uacce | 47 ++ drivers/misc/Kconfig |1 + drivers/misc/Makefile|1 + drivers/misc/uacce/Kconfig | 13 + drivers/misc/uacce/Makefile |2 + drivers/misc/uacce/uacce.c | 1028 ++ include/linux/uacce.h| 156 include/uapi/misc/uacce.h| 40 + 8 files changed, 1288 insertions(+) create mode 100644 Documentation/ABI/testing/sysfs-driver-uacce create mode 100644 drivers/misc/uacce/Kconfig create mode 100644 drivers/misc/uacce/Makefile create mode 100644 drivers/misc/uacce/uacce.c create mode 100644 include/linux/uacce.h create mode 100644 include/uapi/misc/uacce.h diff --git a/Documentation/ABI/testing/sysfs-driver-uacce b/Documentation/ABI/testing/sysfs-driver-uacce new file mode 100644 index 000..563f55c --- /dev/null +++ b/Documentation/ABI/testing/sysfs-driver-uacce @@ -0,0 +1,47 @@ +What: /sys/class/uacce/hisi_zip-/id +Date: Sep 2019 +KernelVersion: 5.4 +Contact:linux-accelerat...@lists.ozlabs.org +Description:Id of the device. + +What: /sys/class/uacce/hisi_zip-/api +Date: Sep 2019 +KernelVersion: 5.4 +Contact:linux-accelerat...@lists.ozlabs.org +Description:Api of the device, used by application to match the correct driver + +What: /sys/class/uacce/hisi_zip-/flags +Date: Sep 2019 +KernelVersion: 5.4 +Contact:linux-accelerat...@lists.ozlabs.org +Description:Attributes of the device, see UACCE_DEV_xxx flag defined in uacce.h + +What: /sys/class/uacce/hisi_zip-/available_instances +Date: Sep 2019 +KernelVersion: 5.4 +Contact:linux-accelerat...@lists.ozlabs.org +Description:Available instances left of the device + +What: /sys/class/uacce/hisi_zip-/algorithms +Date: Sep 2019 +KernelVersion: 5.4 +Contact:linux-accelerat...@lists.ozlabs.org +Description:Algorithms supported by this accelerator + +What: /sys/class/uacce/hisi_zip-/qfrs_offset +Date: Sep 2019 +KernelVersion: 5.4 +Contact:linux-accelerat...@lists.ozlabs.org +Description:Page offsets of each queue file regions + +What: /sys/class/uacce/hisi_zip-/numa_distance +Date: Sep 2019 +KernelVersion: 5.4 +Contact:linux-accelerat...@lists.ozlabs.org +Description:Distance of device node to cpu node + +What: /sys/class/uacce/hisi_zip-/node_id +Date: Sep 2019 +KernelVersion: 5.4 +Contact:linux-accelerat...@lists.ozlabs.org +Description:Id of the numa node diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig index 1690035..94d363c 100644 --- a/drivers/misc/Kconfig +++ b/drivers/misc/Kconfig @@ -503,4 +503,5 @@ source "drivers/misc/cxl/Kconfig" source "drivers/misc/ocxl/Kconfig" source "drivers/misc/cardreader/Kconfig" source "drivers/misc/habanalabs/Kconfig" +source "drivers/misc/uacce/Kconfig" endmenu diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile index abd8ae2..93a131b 100644 --- a/drivers/misc/Makefile +++ b/drivers/misc/Makefile @@ -58,4 +58,5 @@ obj-$(CONFIG_OCXL)+= ocxl/ obj-y += cardreader/ obj-$(CONFIG_PVPANIC) += pvpanic.o obj-$(CONFIG_HABANA_AI)+= habanalabs/ +obj-$(CONFIG_UACCE)+= uacce/ obj-$(CONFIG_XILINX_SDFEC) += xilinx_sdfec.o diff --git a/drivers/misc/uacce/Kconfig b/drivers/misc/uacce/Kconfig new file mode 100644 index 000..e854354 --- /dev/null +++ b/drivers/misc/uacce/Kconfig @@ -0,0 +1,13 @@ +config UACCE + tristate "Accelerator Framework for User Land" + depends on IOMMU_API + help + UACCE provides interface for the user process to access the hardware + without interaction with the kernel space in data path. + + The user-space interface is described in +
[PATCH v4 0/2] Add uacce module for Accelerator
Uacce (Unified/User-space-access-intended Accelerator Framework) targets to provide Shared Virtual Addressing (SVA) between accelerators and processes. So accelerator can access any data structure of the main cpu. This differs from the data sharing between cpu and io device, which share data content rather than address. Because of unified address, hardware and user space of process can share the same virtual address in the communication. Uacce is intended to be used with Jean Philippe Brucker's SVA patchset[1], which enables IO side page fault and PASID support. We have keep verifying with Jean's sva/current [2] We also keep verifying with Eric's SMMUv3 Nested Stage patch [3] This series and related zip & qm driver https://github.com/Linaro/linux-kernel-warpdrive/tree/5.3-uacce-v4 The library and user application: https://github.com/Linaro/warpdrive/tree/wdprd-v1-upstream References: [1] http://jpbrucker.net/sva/ [2] http://www.linux-arm.org/git?p=linux-jpb.git;a=shortlog;h=refs/heads/sva/current [3] https://github.com/eauger/linux/tree/v5.3.0-rc0-2stage-v9 Change History: v4: Based on 5.3 Address Greg comments: Fix state machine, remove potential syslog triggered from user space etc. v3: Recommended by Greg, use sturct uacce_device instead of struct uacce, and use struct *cdev in struct uacce_device, as a result, cdev can be released by itself when refcount decreased to 0. So the two structures are decoupled and self-maintained by themsleves. Also add dev.release for put_device. v2: Address comments from Greg and Jonathan Modify interface uacce_register Drop noiommu mode first v1: 1. Rebase to 5.3-rc1 2. Build on iommu interface 3. Verifying with Jean's sva and Eric's nested mode iommu. 4. User library has developed a lot: support zlib, openssl etc. 5. Move to misc first RFC3: https://lkml.org/lkml/2018/11/12/1951 RFC2: https://lwn.net/Articles/763990/ Background of why Uacce: Von Neumann processor is not good at general data manipulation. It is designed for control-bound rather than data-bound application. The latter need less control path facility and more/specific ALUs. So there are more and more heterogeneous processors, such as encryption/decryption accelerators, TPUs, or EDGE (Explicated Data Graph Execution) processors, introduced to gain better performance or power efficiency for particular applications these days. There are generally two ways to make use of these heterogeneous processors: The first is to make them co-processors, just like FPU. This is good for some application but it has its own cons: It changes the ISA set permanently. You must save all state elements when the process is switched out. But most data-bound processors have a huge set of state elements. It makes the kernel scheduler more complex. The second is Accelerator. It is taken as a IO device from the CPU's point of view (but it need not to be physically). The process, running on CPU, hold a context of the accelerator and send instructions to it as if it calls a function or thread running with FPU. The context is bound with the processor itself. So the state elements remain in the hardware context until the context is released. We believe this is the core feature of an "Accelerator" vs. Co-processor or other heterogeneous processors. The intention of Uacce is to provide the basic facility to backup this scenario. Its first step is to make sure the accelerator and process can share the same address space. So the accelerator ISA can directly address any data structure of the main CPU. This differs from the data sharing between CPU and IO device, which share data content rather than address. So it is different comparing to the other DMA libraries. In the future, we may add more facility to support linking accelerator library to the main application, or managing the accelerator context as special thread. But no matter how, this can be a solid start point for new processor to be used as an "accelerator" as this is the essential requirement. Kenneth Lee (2): uacce: Add documents for uacce uacce: add uacce driver Documentation/ABI/testing/sysfs-driver-uacce | 47 ++ Documentation/misc-devices/uacce.rst | 309 drivers/misc/Kconfig |1 + drivers/misc/Makefile|1 + drivers/misc/uacce/Kconfig | 13 + drivers/misc/uacce/Makefile |2 + drivers/misc/uacce/uacce.c | 1038 ++ include/linux/uacce.h| 156 include/uapi/misc/uacce.h| 40 + 9 files changed, 1607 insertions(+) create mode 100644 Documentation/ABI/testing/sysfs-driver-uacce create mode 100644 Documentation/misc-devices/uacce.rst create mode 100644 drivers/misc/uacce/Kconfig create mode 100644 drivers/misc/uacce/Makefile create mode 100644 drivers/misc/uacce/uacce.c create mode 100644 include/linux/uacce.h create mode 100644 include/uapi/m
[PATCH v4 1/2] uacce: Add documents for uacce
From: Kenneth Lee Uacce (Unified/User-space-access-intended Accelerator Framework) is a kernel module targets to provide Shared Virtual Addressing (SVA) between the accelerator and process. This patch add document to explain how it works. Signed-off-by: Kenneth Lee Signed-off-by: Zaibo Xu Signed-off-by: Zhou Wang Signed-off-by: Zhangfei Gao --- Documentation/misc-devices/uacce.rst | 308 +++ 1 file changed, 308 insertions(+) create mode 100644 Documentation/misc-devices/uacce.rst diff --git a/Documentation/misc-devices/uacce.rst b/Documentation/misc-devices/uacce.rst new file mode 100644 index 000..4fd356e --- /dev/null +++ b/Documentation/misc-devices/uacce.rst @@ -0,0 +1,308 @@ +.. SPDX-License-Identifier: GPL-2.0 + +Introduction of Uacce += + +Uacce (Unified/User-space-access-intended Accelerator Framework) targets to +provide Shared Virtual Addressing (SVA) between accelerators and processes. +So accelerator can access any data structure of the main cpu. +This differs from the data sharing between cpu and io device, which share +data content rather than address. +Because of the unified address, hardware and user space of process can +share the same virtual address in the communication. +Uacce takes the hardware accelerator as a heterogeneous processor, while +IOMMU share the same CPU page tables and as a result the same translation +from va to pa. + +__ __ + | | | | + | User application (CPU) | | Hardware Accelerator | + |__| |__| + +| | +| va | va +V V + ____ +| | | | +| MMU| | IOMMU | +|__| |__| +| | +| | +V pa V pa +___ + | | + | Memory | + |___| + + + +Architecture + + +Uacce is the kernel module, taking charge of iommu and address sharing. +The user drivers and libraries are called WarpDrive. + +A virtual concept, queue, is used for the communication. It provides a +FIFO-like interface. And it maintains a unified address space between the +application and all involved hardware. + + ___ +| | user API | | +| WarpDrive library | > | user driver | +|___| || + || + || + | queue fd | + || + || + v| + ___ _| +| | | | | mmap memory +| Other framework | | uacce | | r/w interface +| crypto/nic/others | |_| | +|___| | + | || + | register | register | + | || + | || + |_ __ | + | | | | | | + - | Device Driver | | IOMMU | | + |_| |__| | + || + |
Re: [PATCH v3 2/2] uacce: add uacce driver
On 2019/9/4 下午8:38, Greg Kroah-Hartman wrote: On Tue, Sep 03, 2019 at 12:14:47PM +0800, Zhangfei Gao wrote: From: Kenneth Lee Uacce (Unified/User-space-access-intended Accelerator Framework) targets to provide Shared Virtual Addressing (SVA) between accelerators and processes. So accelerator can access any data structure of the main cpu. This differs from the data sharing between cpu and io device, which share data content rather than address. Since unified address, hardware and user space of process can share the same virtual address in the communication. Uacce create a chrdev for every registration, the queue is allocated to the process when the chrdev is opened. Then the process can access the hardware resource by interact with the queue file. By mmap the queue file space to user space, the process can directly put requests to the hardware without syscall to the kernel space. Signed-off-by: Kenneth Lee Signed-off-by: Zaibo Xu Signed-off-by: Zhou Wang Signed-off-by: Zhangfei Gao --- Documentation/ABI/testing/sysfs-driver-uacce | 47 ++ drivers/misc/Kconfig |1 + drivers/misc/Makefile|1 + drivers/misc/uacce/Kconfig | 13 + drivers/misc/uacce/Makefile |2 + drivers/misc/uacce/uacce.c | 1096 ++ include/linux/uacce.h| 172 include/uapi/misc/uacce.h| 39 + 8 files changed, 1371 insertions(+) create mode 100644 Documentation/ABI/testing/sysfs-driver-uacce create mode 100644 drivers/misc/uacce/Kconfig create mode 100644 drivers/misc/uacce/Makefile create mode 100644 drivers/misc/uacce/uacce.c create mode 100644 include/linux/uacce.h create mode 100644 include/uapi/misc/uacce.h diff --git a/Documentation/ABI/testing/sysfs-driver-uacce b/Documentation/ABI/testing/sysfs-driver-uacce new file mode 100644 index 000..ee0a66e --- /dev/null +++ b/Documentation/ABI/testing/sysfs-driver-uacce @@ -0,0 +1,47 @@ +What: /sys/class/uacce/hisi_zip-/id +Date: Sep 2019 +KernelVersion: 5.3 5.3 will be released in a week or so, without this file in it, so that's not ok here :( Thanks, will use 5.4 instead. Since 5.4-rc1 still need some time, can I send updated version based on 5.3-rc8 for more review. And I found smmu in 5.3-rc1 have issue, and rc8 is OK. Thanks
Re: [PATCH v3 2/2] uacce: add uacce driver
Hi, Greg Thanks for the careful review. On 2019/9/4 下午8:50, Greg Kroah-Hartman wrote: On Tue, Sep 03, 2019 at 12:14:47PM +0800, Zhangfei Gao wrote: +/** + * uacce_wake_up - Wake up the process who is waiting this queue + * @q the accelerator queue to wake up + */ +void uacce_wake_up(struct uacce_queue *q) +{ + wake_up_interruptible(&q->wait); +} +EXPORT_SYMBOL_GPL(uacce_wake_up); You are exporting things that have no in-kernel user, which is not allowed. Can you redo this patch series with an actual user of this new api you are creating? Otherwise we have no way of properly understanding just what is going on here. Also, do you really need a function to do the above? Why? This function is used by other driver to notify the poll_wait can be started, like when data in fifo is ready. I think we can directly use wake_up_interruptible(&q->wait), then the exporting is not required. +static int uacce_dev_open_check(struct uacce_device *uacce) +{ + /* +* The device can be opened once if it dose not support pasid +*/ + if (uacce->flags & UACCE_DEV_PASID) + return 0; + + if (atomic_cmpxchg(&uacce->state, UACCE_ST_INIT, UACCE_ST_OPENED) != + UACCE_ST_INIT) { + dev_info(&uacce->dev, "this device can be openned only once\n"); No, NEVER do this, it's wrong, an easy way to spam the syslog for no good reason, and flat out does not work at all. Just let userspace deal with something like this, if they want to open multiple times, let it deal with the fallout. Just like a tty device node. Yes, understand now, this print can be triggered by user space. + return -EBUSY; + } + + return 0; +} + +static int uacce_fops_open(struct inode *inode, struct file *filep) +{ + struct uacce_queue *q; + struct iommu_sva *handle = NULL; + struct uacce_device *uacce; + int ret; + int pasid = 0; + + uacce = idr_find(&uacce_idr, iminor(inode)); + if (!uacce) + return -ENODEV; + + if (atomic_read(&uacce->state) == UACCE_ST_RST) + return -EINVAL; What happens if the state changes _right_ after this check? + + if ((!uacce->ops->get_queue) || (!uacce->ops->start_queue)) + return -EINVAL; How would these two functions ever not be NULL? Why check for them in open()? Why not when you register the operations? OK, make sense. + if (!try_module_get(uacce->pdev->driver->owner)) + return -ENODEV; + + ret = uacce_dev_open_check(uacce); + if (ret) + goto open_err; + +#ifdef CONFIG_IOMMU_SVA + if (uacce->flags & UACCE_DEV_PASID) { + handle = iommu_sva_bind_device(uacce->pdev, current->mm, NULL); + if (IS_ERR(handle)) + goto open_err; + pasid = iommu_sva_get_pasid(handle); + } +#endif + ret = uacce->ops->get_queue(uacce, pasid, &q); + if (ret < 0) + goto open_err; + + q->pasid = pasid; + q->handle = handle; + q->uacce = uacce; + q->mm = current->mm; + memset(q->qfrs, 0, sizeof(q->qfrs)); + INIT_LIST_HEAD(&q->list); + init_waitqueue_head(&q->wait); + filep->private_data = q; + mutex_lock(&uacce->q_lock); + list_add(&q->q_dev, &uacce->qs); + mutex_unlock(&uacce->q_lock); + + return 0; + +open_err: + module_put(uacce->pdev->driver->owner); + return ret; +} + +static int uacce_fops_release(struct inode *inode, struct file *filep) +{ + struct uacce_queue *q = filep->private_data; + struct uacce_qfile_region *qfr; + struct uacce_device *uacce = q->uacce; + bool is_to_free_region; + int free_pages = 0; + int i; + + mutex_lock(&uacce->q_lock); + list_del(&q->q_dev); + mutex_unlock(&uacce->q_lock); + + if (atomic_read(&uacce->state) == UACCE_ST_STARTED && + uacce->ops->stop_queue) + uacce->ops->stop_queue(q); Same question as before, what happens right after this check? Do not use an atomic value thinking it is how to properly test for state changes. Use a "real" lock for this, otherwise it does not work at all. Will change this atomic, thanks + + mutex_lock(&uacce_mutex); + + for (i = 0; i < UACCE_QFRT_MAX; i++) { + qfr = q->qfrs[i]; + if (!qfr) + continue; + + is_to_free_region = false; + uacce_queue_unmap_qfr(q, qfr); + if (i == UACCE_QFRT_SS) { + list_del(&q->list); + if (list_empty(&qfr->qs)) +
[PATCH v3 1/2] uacce: Add documents for uacce
From: Kenneth Lee Uacce (Unified/User-space-access-intended Accelerator Framework) is a kernel module targets to provide Shared Virtual Addressing (SVA) between the accelerator and process. This patch add document to explain how it works. Signed-off-by: Kenneth Lee Signed-off-by: Zaibo Xu Signed-off-by: Zhou Wang Signed-off-by: Zhangfei Gao --- Documentation/misc-devices/uacce.rst | 309 +++ 1 file changed, 309 insertions(+) create mode 100644 Documentation/misc-devices/uacce.rst diff --git a/Documentation/misc-devices/uacce.rst b/Documentation/misc-devices/uacce.rst new file mode 100644 index 000..211f796 --- /dev/null +++ b/Documentation/misc-devices/uacce.rst @@ -0,0 +1,309 @@ +.. SPDX-License-Identifier: GPL-2.0 + +Introduction of Uacce += + +Uacce (Unified/User-space-access-intended Accelerator Framework) targets to +provide Shared Virtual Addressing (SVA) between accelerators and processes. +So accelerator can access any data structure of the main cpu. +This differs from the data sharing between cpu and io device, which share +data content rather than address. +Because of the unified address, hardware and user space of process can +share the same virtual address in the communication. +Uacce takes the hardware accelerator as a heterogeneous processor, while +IOMMU share the same CPU page tables and as a result the same translation +from va to pa. + +__ __ + | | | | + | User application (CPU) | | Hardware Accelerator | + |__| |__| + +| | +| va | va +V V + ____ +| | | | +| MMU| | IOMMU | +|__| |__| +| | +| | +V pa V pa +___ + | | + | Memory | + |___| + + + +Architecture + + +Uacce is the kernel module, taking charge of iommu and address sharing. +The user drivers and libraries are called WarpDrive. + +A virtual concept, queue, is used for the communication. It provides a +FIFO-like interface. And it maintains a unified address space between the +application and all involved hardware. + + ___ +| | user API | | +| WarpDrive library | > | user driver | +|___| || + || + || + | queue fd | + || + || + v| + ___ _| +| | | | | mmap memory +| Other framework | | uacce | | r/w interface +| crypto/nic/others | |_| | +|___| | + | || + | register | register | + | || + | || + |_ __ | + | | | | | | + - | Device Driver | | IOMMU | | + |_| |__| | + || + |
[PATCH v3 2/2] uacce: add uacce driver
From: Kenneth Lee Uacce (Unified/User-space-access-intended Accelerator Framework) targets to provide Shared Virtual Addressing (SVA) between accelerators and processes. So accelerator can access any data structure of the main cpu. This differs from the data sharing between cpu and io device, which share data content rather than address. Since unified address, hardware and user space of process can share the same virtual address in the communication. Uacce create a chrdev for every registration, the queue is allocated to the process when the chrdev is opened. Then the process can access the hardware resource by interact with the queue file. By mmap the queue file space to user space, the process can directly put requests to the hardware without syscall to the kernel space. Signed-off-by: Kenneth Lee Signed-off-by: Zaibo Xu Signed-off-by: Zhou Wang Signed-off-by: Zhangfei Gao --- Documentation/ABI/testing/sysfs-driver-uacce | 47 ++ drivers/misc/Kconfig |1 + drivers/misc/Makefile|1 + drivers/misc/uacce/Kconfig | 13 + drivers/misc/uacce/Makefile |2 + drivers/misc/uacce/uacce.c | 1096 ++ include/linux/uacce.h| 172 include/uapi/misc/uacce.h| 39 + 8 files changed, 1371 insertions(+) create mode 100644 Documentation/ABI/testing/sysfs-driver-uacce create mode 100644 drivers/misc/uacce/Kconfig create mode 100644 drivers/misc/uacce/Makefile create mode 100644 drivers/misc/uacce/uacce.c create mode 100644 include/linux/uacce.h create mode 100644 include/uapi/misc/uacce.h diff --git a/Documentation/ABI/testing/sysfs-driver-uacce b/Documentation/ABI/testing/sysfs-driver-uacce new file mode 100644 index 000..ee0a66e --- /dev/null +++ b/Documentation/ABI/testing/sysfs-driver-uacce @@ -0,0 +1,47 @@ +What: /sys/class/uacce/hisi_zip-/id +Date: Sep 2019 +KernelVersion: 5.3 +Contact:linux-accelerat...@lists.ozlabs.org +Description:Id of the device. + +What: /sys/class/uacce/hisi_zip-/api +Date: Sep 2019 +KernelVersion: 5.3 +Contact:linux-accelerat...@lists.ozlabs.org +Description:Api of the device, used by application to match the correct driver + +What: /sys/class/uacce/hisi_zip-/flags +Date: Sep 2019 +KernelVersion: 5.3 +Contact:linux-accelerat...@lists.ozlabs.org +Description:Attributes of the device, see UACCE_DEV_xxx flag defined in uacce.h + +What: /sys/class/uacce/hisi_zip-/available_instances +Date: Sep 2019 +KernelVersion: 5.3 +Contact:linux-accelerat...@lists.ozlabs.org +Description:Available instances left of the device + +What: /sys/class/uacce/hisi_zip-/algorithms +Date: Sep 2019 +KernelVersion: 5.3 +Contact:linux-accelerat...@lists.ozlabs.org +Description:Algorithms supported by this accelerator + +What: /sys/class/uacce/hisi_zip-/qfrs_offset +Date: Sep 2019 +KernelVersion: 5.3 +Contact:linux-accelerat...@lists.ozlabs.org +Description:Page offsets of each queue file regions + +What: /sys/class/uacce/hisi_zip-/numa_distance +Date: Sep 2019 +KernelVersion: 5.3 +Contact:linux-accelerat...@lists.ozlabs.org +Description:Distance of device node to cpu node + +What: /sys/class/uacce/hisi_zip-/node_id +Date: Sep 2019 +KernelVersion: 5.3 +Contact:linux-accelerat...@lists.ozlabs.org +Description:Id of the numa node diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig index 6abfc8e..8073eb8 100644 --- a/drivers/misc/Kconfig +++ b/drivers/misc/Kconfig @@ -502,4 +502,5 @@ source "drivers/misc/cxl/Kconfig" source "drivers/misc/ocxl/Kconfig" source "drivers/misc/cardreader/Kconfig" source "drivers/misc/habanalabs/Kconfig" +source "drivers/misc/uacce/Kconfig" endmenu diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile index abd8ae2..93a131b 100644 --- a/drivers/misc/Makefile +++ b/drivers/misc/Makefile @@ -58,4 +58,5 @@ obj-$(CONFIG_OCXL)+= ocxl/ obj-y += cardreader/ obj-$(CONFIG_PVPANIC) += pvpanic.o obj-$(CONFIG_HABANA_AI)+= habanalabs/ +obj-$(CONFIG_UACCE)+= uacce/ obj-$(CONFIG_XILINX_SDFEC) += xilinx_sdfec.o diff --git a/drivers/misc/uacce/Kconfig b/drivers/misc/uacce/Kconfig new file mode 100644 index 000..e854354 --- /dev/null +++ b/drivers/misc/uacce/Kconfig @@ -0,0 +1,13 @@ +config UACCE + tristate "Accelerator Framework for User Land" + depends on IOMMU_API + help + UACCE provides interface for the user process to access the hardware + without interaction with the kernel space in data path. + + The user-space interface is described in +
[PATCH v3 0/2] Add uacce module for Accelerator
Uacce (Unified/User-space-access-intended Accelerator Framework) targets to provide Shared Virtual Addressing (SVA) between accelerators and processes. So accelerator can access any data structure of the main cpu. This differs from the data sharing between cpu and io device, which share data content rather than address. Because of unified address, hardware and user space of process can share the same virtual address in the communication. Uacce is intended to be used with Jean Philippe Brucker's SVA patchset[1], which enables IO side page fault and PASID support. We have keep verifying with Jean's sva/current [2] We also keep verifying with Eric's SMMUv3 Nested Stage patch [3] This series and related zip & qm driver https://github.com/Linaro/linux-kernel-warpdrive/tree/5.3-rc1-warpdrive-v3 The library and user application: https://github.com/Linaro/warpdrive/tree/wdprd-v1-upstream References: [1] http://jpbrucker.net/sva/ [2] http://www.linux-arm.org/git?p=linux-jpb.git;a=shortlog;h=refs/heads/sva/current [3] https://github.com/eauger/linux/tree/v5.3.0-rc0-2stage-v9 Change History: v3: Recommended by Greg, use sturct uacce_device instead of struct uacce, and use struct *cdev in struct uacce_device, as a result, cdev can be released by itself when refcount decreased to 0. So the two structures are decoupled and self-maintained by themsleves. Also add dev.release for put_device. v2: Address comments from Greg and Jonathan Modify interface uacce_register Drop noiommu mode first v1: 1. Rebase to 5.3-rc1 2. Build on iommu interface 3. Verifying with Jean's sva and Eric's nested mode iommu. 4. User library has developed a lot: support zlib, openssl etc. 5. Move to misc first RFC3: https://lkml.org/lkml/2018/11/12/1951 RFC2: https://lwn.net/Articles/763990/ Background of why Uacce: Von Neumann processor is not good at general data manipulation. It is designed for control-bound rather than data-bound application. The latter need less control path facility and more/specific ALUs. So there are more and more heterogeneous processors, such as encryption/decryption accelerators, TPUs, or EDGE (Explicated Data Graph Execution) processors, introduced to gain better performance or power efficiency for particular applications these days. There are generally two ways to make use of these heterogeneous processors: The first is to make them co-processors, just like FPU. This is good for some application but it has its own cons: It changes the ISA set permanently. You must save all state elements when the process is switched out. But most data-bound processors have a huge set of state elements. It makes the kernel scheduler more complex. The second is Accelerator. It is taken as a IO device from the CPU's point of view (but it need not to be physically). The process, running on CPU, hold a context of the accelerator and send instructions to it as if it calls a function or thread running with FPU. The context is bound with the processor itself. So the state elements remain in the hardware context until the context is released. We believe this is the core feature of an "Accelerator" vs. Co-processor or other heterogeneous processors. The intention of Uacce is to provide the basic facility to backup this scenario. Its first step is to make sure the accelerator and process can share the same address space. So the accelerator ISA can directly address any data structure of the main CPU. This differs from the data sharing between CPU and IO device, which share data content rather than address. So it is different comparing to the other DMA libraries. In the future, we may add more facility to support linking accelerator library to the main application, or managing the accelerator context as special thread. But no matter how, this can be a solid start point for new processor to be used as an "accelerator" as this is the essential requirement. Kenneth Lee (2): uacce: Add documents for uacce uacce: add uacce driver Documentation/ABI/testing/sysfs-driver-uacce | 47 ++ Documentation/misc-devices/uacce.rst | 309 drivers/misc/Kconfig |1 + drivers/misc/Makefile|1 + drivers/misc/uacce/Kconfig | 13 + drivers/misc/uacce/Makefile |2 + drivers/misc/uacce/uacce.c | 1094 ++ include/linux/uacce.h| 172 include/uapi/misc/uacce.h| 39 + 9 files changed, 1678 insertions(+) create mode 100644 Documentation/ABI/testing/sysfs-driver-uacce create mode 100644 Documentation/misc-devices/uacce.rst create mode 100644 drivers/misc/uacce/Kconfig create mode 100644 drivers/misc/uacce/Makefile create mode 100644 drivers/misc/uacce/uacce.c create mode 100644 include/linux/uacce.h create mode 100644 include/uapi/misc/uacce.h -- 2.7.4
Re: [PATCH v2 2/2] uacce: add uacce driver
Hi, Greg On 2019/8/30 下午10:54, zhangfei wrote: On 2019/8/28 下午11:22, Greg Kroah-Hartman wrote: On Wed, Aug 28, 2019 at 09:27:56PM +0800, Zhangfei Gao wrote: +struct uacce { + const char *drv_name; + const char *algs; + const char *api_ver; + unsigned int flags; + unsigned long qf_pg_start[UACCE_QFRT_MAX]; + struct uacce_ops *ops; + struct device *pdev; + bool is_vf; + u32 dev_id; + struct cdev cdev; + struct device dev; + void *priv; + atomic_t state; + int prot; + struct mutex q_lock; + struct list_head qs; +}; At a quick glance, this problem really stood out to me. You CAN NOT have two different objects within a structure that have different lifetime rules and reference counts. You do that here with both a 'struct cdev' and a 'struct device'. Pick one or the other, but never both. I would recommend using a 'struct device' and then a 'struct cdev *'. That way you get the advantage of using the driver model properly, and then just adding your char device node pointer to "the side" which interacts with this device. Then you might want to call this "struct uacce_device" :) I think I understand now. 'struct device' and then a 'struct cdev' have different refcounts. Using 'struct cdev *', the release is not in uacce.c, but controlled by cdev itself. So uacce is decoupled with cdev. //Using 'struct cdev *' cdev_alloc->cdev_dynamic_release:kfree(p); uacce_destroy_chrdev: cdev_device_del->cdev_del(cdev)->kobject_put(&p->kobj); if (refcount--) == 0 cdev_dynamic_release->kfree(p); //Using 'struct device' cdev_init->cdev_default_release cdev is freed in uacce.c, So 'struct device' and then a 'struct cdev' are bind together, while cdev and uacce->dev have different refcount. Thanks for the patience.
Re: [PATCH v2 2/2] uacce: add uacce driver
Hi, Greg On 2019/8/29 下午5:54, Greg Kroah-Hartman wrote: On Thu, Aug 29, 2019 at 05:05:13PM +0800, zhangfei wrote: Hi, Greg On 2019/8/28 下午11:22, Greg Kroah-Hartman wrote: On Wed, Aug 28, 2019 at 09:27:56PM +0800, Zhangfei Gao wrote: +struct uacce { + const char *drv_name; + const char *algs; + const char *api_ver; + unsigned int flags; + unsigned long qf_pg_start[UACCE_QFRT_MAX]; + struct uacce_ops *ops; + struct device *pdev; + bool is_vf; + u32 dev_id; + struct cdev cdev; + struct device dev; + void *priv; + atomic_t state; + int prot; + struct mutex q_lock; + struct list_head qs; +}; At a quick glance, this problem really stood out to me. You CAN NOT have two different objects within a structure that have different lifetime rules and reference counts. You do that here with both a 'struct cdev' and a 'struct device'. Pick one or the other, but never both. I would recommend using a 'struct device' and then a 'struct cdev *'. That way you get the advantage of using the driver model properly, and then just adding your char device node pointer to "the side" which interacts with this device. Then you might want to call this "struct uacce_device" :) Here the 'struct cdev' and 'struct device' have the same lifetime and refcount. No they do not, that's impossible as refcounts are incremented from different places (i.e. userspace). They are allocated with uacce when uacce_register and freed when uacce_unregister. And that will not work. I am sorry, could I ask more about this part. * This function should be used whenever the struct cdev and the * struct device are members of the same structure whose lifetime is * managed by the struct device. From cdev_device_add comments, looks struct cdev and stuct device can be in the same structure like uacce, and uacce is released when put_device(device) Also cdev_device_del do the device_del(dev) and cdev_del(cdev). Copy: fs/char_dev.c /** * cdev_device_add() - add a char device and it's corresponding * struct device, linkink * @dev: the device structure * @cdev: the cdev structure * * cdev_device_add() adds the char device represented by @cdev to the system, * just as cdev_add does. It then adds @dev to the system using device_add * The dev_t for the char device will be taken from the struct device which * needs to be initialized first. This helper function correctly takes a * reference to the parent device so the parent will not get released until * all references to the cdev are released. * * This helper uses dev->devt for the device number. If it is not set * it will not add the cdev and it will be equivalent to device_add. * * This function should be used whenever the struct cdev and the * struct device are members of the same structure whose lifetime is * managed by the struct device. * * NOTE: Callers must assume that userspace was able to open the cdev and * can call cdev fops callbacks at any time, even if this function fails. */ int cdev_device_add(struct cdev *cdev, struct device *dev) { int rc = 0; if (dev->devt) { cdev_set_parent(cdev, &dev->kobj); rc = cdev_add(cdev, dev->devt, 1); if (rc) return rc; } rc = device_add(dev); if (rc) cdev_del(cdev); return rc; } Thanks
Re: [PATCH v2 2/2] uacce: add uacce driver
Hi, Greg On 2019/8/29 下午5:54, Greg Kroah-Hartman wrote: On Thu, Aug 29, 2019 at 05:05:13PM +0800, zhangfei wrote: Hi, Greg On 2019/8/28 下午11:22, Greg Kroah-Hartman wrote: On Wed, Aug 28, 2019 at 09:27:56PM +0800, Zhangfei Gao wrote: +struct uacce { + const char *drv_name; + const char *algs; + const char *api_ver; + unsigned int flags; + unsigned long qf_pg_start[UACCE_QFRT_MAX]; + struct uacce_ops *ops; + struct device *pdev; + bool is_vf; + u32 dev_id; + struct cdev cdev; + struct device dev; + void *priv; + atomic_t state; + int prot; + struct mutex q_lock; + struct list_head qs; +}; At a quick glance, this problem really stood out to me. You CAN NOT have two different objects within a structure that have different lifetime rules and reference counts. You do that here with both a 'struct cdev' and a 'struct device'. Pick one or the other, but never both. I would recommend using a 'struct device' and then a 'struct cdev *'. That way you get the advantage of using the driver model properly, and then just adding your char device node pointer to "the side" which interacts with this device. Then you might want to call this "struct uacce_device" :) Here the 'struct cdev' and 'struct device' have the same lifetime and refcount. No they do not, that's impossible as refcounts are incremented from different places (i.e. userspace). Yes, cdev's refcount is increased by open, from user space. Not sure whether I understand correctly, Is this correct? @@ -819,9 +819,10 @@ static int uacce_create_chrdev(struct uacce *uacce) if (ret < 0) return ret; - cdev_init(&uacce->cdev, &uacce_fops); + uacce->cdev = cdev_alloc(); + uacce->cdev->ops = &uacce_fops; uacce->dev_id = ret; - uacce->cdev.owner = THIS_MODULE; + uacce->cdev->owner = THIS_MODULE; device_initialize(&uacce->dev); uacce->dev.devt = MKDEV(MAJOR(uacce_devt), uacce->dev_id); uacce->dev.class = uacce_class; @@ -829,7 +830,7 @@ static int uacce_create_chrdev(struct uacce *uacce) uacce->dev.parent = uacce->pdev; uacce->dev.release = uacce_release; dev_set_name(&uacce->dev, "%s-%d", uacce->drv_name, uacce->dev_id); - ret = cdev_device_add(&uacce->cdev, &uacce->dev); + ret = cdev_device_add(uacce->cdev, &uacce->dev); if (ret) goto err_with_idr; @@ -843,7 +844,7 @@ static int uacce_create_chrdev(struct uacce *uacce) static void uacce_destroy_chrdev(struct uacce *uacce) { - cdev_device_del(&uacce->cdev, &uacce->dev); + cdev_device_del(uacce->cdev, &uacce->dev); put_device(&uacce->dev); } diff --git a/include/linux/uacce.h b/include/linux/uacce.h index 1892b94..39a2c4b 100644 --- a/include/linux/uacce.h +++ b/include/linux/uacce.h @@ -155,7 +155,7 @@ struct uacce { struct device *pdev; bool is_vf; u32 dev_id; - struct cdev cdev; + struct cdev *cdev; And use struct uacce_device instead of struct uacce -struct uacce *uacce_register(struct device *parent, - struct uacce_interface *interface); +struct uacce_device *uacce_register(struct device *parent, + struct uacce_interface *interface); They are allocated with uacce when uacce_register and freed when uacce_unregister. And that will not work. To make it clear, how about adding this. +static void uacce_release(struct device *dev) +{ + struct uacce *uacce = UACCE_FROM_CDEV_ATTR(dev); + + idr_remove(&uacce_idr, uacce->dev_id); + kfree(uacce); +} + static int uacce_create_chrdev(struct uacce *uacce) { int ret; @@ -819,6 +827,7 @@ static int uacce_create_chrdev(struct uacce *uacce) uacce->dev.class = uacce_class; uacce->dev.groups = uacce_dev_attr_groups; uacce->dev.parent = uacce->pdev; + uacce->dev.release = uacce_release; You have to have a release function today, otherwise you will get nasty kernel messages from the log. I don't know why you aren't seeing that already. Yes, kernel report warning after using put_device. dev_set_name(&uacce->dev, "%s-%d", uacce->drv_name, uacce->dev_id); ret = cdev_device_add(&uacce->cdev, &uacce->dev); if (ret) @@ -835,7 +844,7 @@ static int uacce_create_chrdev(struct uacce *uacce) static void uacce_destroy_chrdev(struct uacce *uacce) { cdev_device_del(&uacce->cdev, &uacce->dev); - idr_remove(&uacce_idr, uacce->dev_id); + put_device(&uacce->dev); } s
Re: [PATCH v2 2/2] uacce: add uacce driver
Hi, Greg On 2019/8/28 下午11:22, Greg Kroah-Hartman wrote: On Wed, Aug 28, 2019 at 09:27:56PM +0800, Zhangfei Gao wrote: +struct uacce { + const char *drv_name; + const char *algs; + const char *api_ver; + unsigned int flags; + unsigned long qf_pg_start[UACCE_QFRT_MAX]; + struct uacce_ops *ops; + struct device *pdev; + bool is_vf; + u32 dev_id; + struct cdev cdev; + struct device dev; + void *priv; + atomic_t state; + int prot; + struct mutex q_lock; + struct list_head qs; +}; At a quick glance, this problem really stood out to me. You CAN NOT have two different objects within a structure that have different lifetime rules and reference counts. You do that here with both a 'struct cdev' and a 'struct device'. Pick one or the other, but never both. I would recommend using a 'struct device' and then a 'struct cdev *'. That way you get the advantage of using the driver model properly, and then just adding your char device node pointer to "the side" which interacts with this device. Then you might want to call this "struct uacce_device" :) Here the 'struct cdev' and 'struct device' have the same lifetime and refcount. They are allocated with uacce when uacce_register and freed when uacce_unregister. To make it clear, how about adding this. +static void uacce_release(struct device *dev) +{ + struct uacce *uacce = UACCE_FROM_CDEV_ATTR(dev); + + idr_remove(&uacce_idr, uacce->dev_id); + kfree(uacce); +} + static int uacce_create_chrdev(struct uacce *uacce) { int ret; @@ -819,6 +827,7 @@ static int uacce_create_chrdev(struct uacce *uacce) uacce->dev.class = uacce_class; uacce->dev.groups = uacce_dev_attr_groups; uacce->dev.parent = uacce->pdev; + uacce->dev.release = uacce_release; dev_set_name(&uacce->dev, "%s-%d", uacce->drv_name, uacce->dev_id); ret = cdev_device_add(&uacce->cdev, &uacce->dev); if (ret) @@ -835,7 +844,7 @@ static int uacce_create_chrdev(struct uacce *uacce) static void uacce_destroy_chrdev(struct uacce *uacce) { cdev_device_del(&uacce->cdev, &uacce->dev); - idr_remove(&uacce_idr, uacce->dev_id); + put_device(&uacce->dev); } static int uacce_dev_match(struct device *dev, void *data) @@ -1042,8 +1051,6 @@ void uacce_unregister(struct uacce *uacce) uacce_destroy_chrdev(uacce); mutex_unlock(&uacce_mutex); - - kfree(uacce); } uacce_destroy_chrdev->put_device(&uacce->dev)->uacce_release->kfree(uacce). And find there are many examples in driver/ $ grep -rn cdev_device_add drivers/ drivers/rtc/class.c:362: err = cdev_device_add(&rtc->char_dev, &rtc->dev); rivers/gpio/gpiolib.c:1181: status = cdev_device_add(&gdev->chrdev, &gdev->dev); drivers/soc/qcom/rmtfs_mem.c:223: ret = cdev_device_add(&rmtfs_mem->cdev, &rmtfs_mem->dev); drivers/input/joydev.c:989: error = cdev_device_add(&joydev->cdev, &joydev->dev); drivers/input/mousedev.c:907: error = cdev_device_add(&mousedev->cdev, &mousedev->dev); drivers/input/evdev.c:1419: error = cdev_device_add(&evdev->cdev, &evdev->dev); like drivers/input/evdev.c, evdev is alloced with initialization of dev and cdev, and evdev is freed in release ops evdev_free struct evdev { struct device dev; struct cdev cdev; ~ }; Thanks
[PATCH v2 1/2] uacce: Add documents for uacce
From: Kenneth Lee Uacce (Unified/User-space-access-intended Accelerator Framework) is a kernel module targets to provide Shared Virtual Addressing (SVA) between the accelerator and process. This patch add document to explain how it works. Signed-off-by: Kenneth Lee Signed-off-by: Zaibo Xu Signed-off-by: Zhou Wang Signed-off-by: Zhangfei Gao --- Documentation/misc-devices/uacce.rst | 309 +++ 1 file changed, 309 insertions(+) create mode 100644 Documentation/misc-devices/uacce.rst diff --git a/Documentation/misc-devices/uacce.rst b/Documentation/misc-devices/uacce.rst new file mode 100644 index 000..a2cbd00 --- /dev/null +++ b/Documentation/misc-devices/uacce.rst @@ -0,0 +1,309 @@ +.. SPDX-License-Identifier: GPL-2.0 + +Introduction of Uacce += + +Uacce (Unified/User-space-access-intended Accelerator Framework) targets to +provide Shared Virtual Addressing (SVA) between accelerators and processes. +So accelerator can access any data structure of the main cpu. +This differs from the data sharing between cpu and io device, which share +data content rather than address. +Because of the unified address, hardware and user space of process can +share the same virtual address in the communication. +Uacce takes the hardware accelerator as a heterogeneous processor, while +IOMMU share the same CPU page tables and as a result the same translation +from va to pa. + +__ __ + | | | | + | User application (CPU) | | Hardware Accelerator | + |__| |__| + +| | +| va | va +V V + ____ +| | | | +| MMU| | IOMMU | +|__| |__| +| | +| | +V pa V pa +___ + | | + | Memory | + |___| + + + +Architecture + + +Uacce is the kernel module, taking charge of iommu and address sharing. +The user drivers and libraries are called WarpDrive. + +A virtual concept, queue, is used for the communication. It provides a +FIFO-like interface. And it maintains a unified address space between the +application and all involved hardware. + + ___ +| | user API | | +| WarpDrive library | > | user driver | +|___| || + || + || + | queue fd | + || + || + v| + ___ _| +| | | | | mmap memory +| Other framework | | uacce | | r/w interface +| crypto/nic/others | |_| | +|___| | + | || + | register | register | + | || + | || + |_ __ | + | | | | | | + - | Device Driver | | IOMMU | | + |_| |__| | + || + |
[PATCH v2 2/2] uacce: add uacce driver
From: Kenneth Lee Uacce (Unified/User-space-access-intended Accelerator Framework) targets to provide Shared Virtual Addressing (SVA) between accelerators and processes. So accelerator can access any data structure of the main cpu. This differs from the data sharing between cpu and io device, which share data content rather than address. Since unified address, hardware and user space of process can share the same virtual address in the communication. Uacce create a chrdev for every registration, the queue is allocated to the process when the chrdev is opened. Then the process can access the hardware resource by interact with the queue file. By mmap the queue file space to user space, the process can directly put requests to the hardware without syscall to the kernel space. Signed-off-by: Kenneth Lee Signed-off-by: Zaibo Xu Signed-off-by: Zhou Wang Signed-off-by: Zhangfei Gao --- Documentation/ABI/testing/sysfs-driver-uacce | 47 ++ drivers/misc/Kconfig |1 + drivers/misc/Makefile|1 + drivers/misc/uacce/Kconfig | 13 + drivers/misc/uacce/Makefile |2 + drivers/misc/uacce/uacce.c | 1086 ++ include/linux/uacce.h| 172 include/uapi/misc/uacce.h| 39 + 8 files changed, 1361 insertions(+) create mode 100644 Documentation/ABI/testing/sysfs-driver-uacce create mode 100644 drivers/misc/uacce/Kconfig create mode 100644 drivers/misc/uacce/Makefile create mode 100644 drivers/misc/uacce/uacce.c create mode 100644 include/linux/uacce.h create mode 100644 include/uapi/misc/uacce.h diff --git a/Documentation/ABI/testing/sysfs-driver-uacce b/Documentation/ABI/testing/sysfs-driver-uacce new file mode 100644 index 000..44e2f69 --- /dev/null +++ b/Documentation/ABI/testing/sysfs-driver-uacce @@ -0,0 +1,47 @@ +What: /sys/class/uacce/hisi_zip-/id +Date: Aug 2019 +KernelVersion: 5.3 +Contact:linux-accelerat...@lists.ozlabs.org +Description:Id of the device. + +What: /sys/class/uacce/hisi_zip-/api +Date: Aug 2019 +KernelVersion: 5.3 +Contact:linux-accelerat...@lists.ozlabs.org +Description:Api of the device, used by application to match the correct driver + +What: /sys/class/uacce/hisi_zip-/flags +Date: Aug 2019 +KernelVersion: 5.3 +Contact:linux-accelerat...@lists.ozlabs.org +Description:Attributes of the device, see UACCE_DEV_xxx flag defined in uacce.h + +What: /sys/class/uacce/hisi_zip-/available_instances +Date: Aug 2019 +KernelVersion: 5.3 +Contact:linux-accelerat...@lists.ozlabs.org +Description:Available instances left of the device + +What: /sys/class/uacce/hisi_zip-/algorithms +Date: Aug 2019 +KernelVersion: 5.3 +Contact:linux-accelerat...@lists.ozlabs.org +Description:Algorithms supported by this accelerator + +What: /sys/class/uacce/hisi_zip-/qfrs_offset +Date: Aug 2019 +KernelVersion: 5.3 +Contact:linux-accelerat...@lists.ozlabs.org +Description:Page offsets of each queue file regions + +What: /sys/class/uacce/hisi_zip-/numa_distance +Date: Aug 2019 +KernelVersion: 5.3 +Contact:linux-accelerat...@lists.ozlabs.org +Description:Distance of device node to cpu node + +What: /sys/class/uacce/hisi_zip-/node_id +Date: Aug 2019 +KernelVersion: 5.3 +Contact:linux-accelerat...@lists.ozlabs.org +Description:Id of the numa node diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig index 6abfc8e..8073eb8 100644 --- a/drivers/misc/Kconfig +++ b/drivers/misc/Kconfig @@ -502,4 +502,5 @@ source "drivers/misc/cxl/Kconfig" source "drivers/misc/ocxl/Kconfig" source "drivers/misc/cardreader/Kconfig" source "drivers/misc/habanalabs/Kconfig" +source "drivers/misc/uacce/Kconfig" endmenu diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile index abd8ae2..93a131b 100644 --- a/drivers/misc/Makefile +++ b/drivers/misc/Makefile @@ -58,4 +58,5 @@ obj-$(CONFIG_OCXL)+= ocxl/ obj-y += cardreader/ obj-$(CONFIG_PVPANIC) += pvpanic.o obj-$(CONFIG_HABANA_AI)+= habanalabs/ +obj-$(CONFIG_UACCE)+= uacce/ obj-$(CONFIG_XILINX_SDFEC) += xilinx_sdfec.o diff --git a/drivers/misc/uacce/Kconfig b/drivers/misc/uacce/Kconfig new file mode 100644 index 000..e854354 --- /dev/null +++ b/drivers/misc/uacce/Kconfig @@ -0,0 +1,13 @@ +config UACCE + tristate "Accelerator Framework for User Land" + depends on IOMMU_API + help + UACCE provides interface for the user process to access the hardware + without interaction with the kernel space in data path. + + The user-space interface is described in +
[PATCH v2 0/2] Add uacce module for Accelerator
Uacce (Unified/User-space-access-intended Accelerator Framework) targets to provide Shared Virtual Addressing (SVA) between accelerators and processes. So accelerator can access any data structure of the main cpu. This differs from the data sharing between cpu and io device, which share data content rather than address. Since unified address, hardware and user space of process can share the same virtual address in the communication. Uacce is intended to be used with Jean Philippe Brucker's SVA patchset[1], which enables IO side page fault and PASID support. We have keep verifying with Jean's sva/current [2] We also keep verifying with Eric's SMMUv3 Nested Stage patch [3] This series and related zip & qm driver https://github.com/Linaro/linux-kernel-warpdrive/tree/5.3-rc1-warpdrive-v2 The library and user application: https://github.com/Linaro/warpdrive/tree/5.3-rc1-v2 References: [1] http://jpbrucker.net/sva/ [2] http://www.linux-arm.org/git?p=linux-jpb.git;a=shortlog;h=refs/heads/sva/current [3] https://github.com/eauger/linux/tree/v5.3.0-rc0-2stage-v9 Change History: v2: Address comments from Greg and Jonathan Modify interface uacce_register Drop noiommu mode first v1: 1. Rebase to 5.3-rc1 2. Build on iommu interface 3. Verifying with Jean's sva and Eric's nested mode iommu. 4. User library has developed a lot: support zlib, openssl etc. 5. Move to misc first RFC3: https://lkml.org/lkml/2018/11/12/1951 RFC2: https://lwn.net/Articles/763990/ Background of why Uacce: Von Neumann processor is not good at general data manipulation. It is designed for control-bound rather than data-bound application. The latter need less control path facility and more/specific ALUs. So there are more and more heterogeneous processors, such as encryption/decryption accelerators, TPUs, or EDGE (Explicated Data Graph Execution) processors, introduced to gain better performance or power efficiency for particular applications these days. There are generally two ways to make use of these heterogeneous processors: The first is to make them co-processors, just like FPU. This is good for some application but it has its own cons: It changes the ISA set permanently. You must save all state elements when the process is switched out. But most data-bound processors have a huge set of state elements. It makes the kernel scheduler more complex. The second is Accelerator. It is taken as a IO device from the CPU's point of view (but it need not to be physically). The process, running on CPU, hold a context of the accelerator and send instructions to it as if it calls a function or thread running with FPU. The context is bound with the processor itself. So the state elements remain in the hardware context until the context is released. We believe this is the core feature of an "Accelerator" vs. Co-processor or other heterogeneous processors. The intention of Uacce is to provide the basic facility to backup this scenario. Its first step is to make sure the accelerator and process can share the same address space. So the accelerator ISA can directly address any data structure of the main CPU. This differs from the data sharing between CPU and IO device, which share data content rather than address. So it is different comparing to the other DMA libraries. In the future, we may add more facility to support linking accelerator library to the main application, or managing the accelerator context as special thread. But no matter how, this can be a solid start point for new processor to be used as an "accelerator" as this is the essential requirement. Kenneth Lee (2): uacce: Add documents for uacce uacce: add uacce driver Documentation/ABI/testing/sysfs-driver-uacce | 47 ++ Documentation/misc-devices/uacce.rst | 335 drivers/misc/Kconfig |1 + drivers/misc/Makefile|1 + drivers/misc/uacce/Kconfig | 13 + drivers/misc/uacce/Makefile |2 + drivers/misc/uacce/uacce.c | 1086 ++ include/linux/uacce.h| 172 include/uapi/misc/uacce.h| 39 + 9 files changed, 1696 insertions(+) create mode 100644 Documentation/ABI/testing/sysfs-driver-uacce create mode 100644 Documentation/misc-devices/uacce.rst create mode 100644 drivers/misc/uacce/Kconfig create mode 100644 drivers/misc/uacce/Makefile create mode 100644 drivers/misc/uacce/uacce.c create mode 100644 include/linux/uacce.h create mode 100644 include/uapi/misc/uacce.h -- 2.7.4
Re: [PATCH 2/2] uacce: add uacce module
On 2019/8/20 下午10:33, Greg Kroah-Hartman wrote: On Tue, Aug 20, 2019 at 08:36:50PM +0800, zhangfei wrote: Hi, Greg On 2019/8/19 下午6:24, Greg Kroah-Hartman wrote: +static int uacce_create_chrdev(struct uacce *uacce) +{ + int ret; + + ret = idr_alloc(&uacce_idr, uacce, 0, 0, GFP_KERNEL); + if (ret < 0) + return ret; + Shouldn't this function create the memory needed for this structure? You are relying ont he caller to do it for you, why? I think you mean uacce structure here. Yes, currently we count on caller to prepare uacce structure and call uacce_register(uacce). We still think this method is simpler, prepare uacce, register uacce. And there are other system using the same method, like crypto (crypto_register_acomp), nand, etc. crypto is not a subsystem to ever try to emulate :) You are creating a structure with a lifetime that you control, don't have someone else create your memory, that's almost never what you want to do. Most all driver subsystems create their own memory chunks for what they need to do, it's a much better pattern. Especially when you get into pointer lifetime issues... OK, understand now, thanks for your patience. will use this instead. struct uacce_interface { char name[32]; unsigned int flags; struct uacce_ops *ops; }; struct uacce *uacce_register(struct device *dev, struct uacce_interface *interface); What? Why do you need a structure? A pointer to the name and the ops should be all that is needed, right? We are thinking transfer structure will be more flexible. And modify api later would be difficult, requiring many drivers modify together. Currently parameters need a flag, a pointer to the name, and ops, but in case more requirement from future drivers usage. Also refer usb_register_dev, sdhci_pltfm_init etc, and the structure para can be set as static. And 'dev' here is a pointer to the parent, right? Might want to make that explicit in the name of the variable :) Yes, 'dev' is parent, will change to 'pdev', thanks. + +static int uacce_dev_match(struct device *dev, void *data) +{ + if (dev->parent == data) + return -EBUSY; There should be in-kernel functions for this now, no need for you to roll your own. Sorry, do not find this function. Only find class_find_device, which still require match. It is in linux-next, look there... Suppose you mean the funcs: device_match_name, device_match_of_node,device_match_devt etc. Here we need dev->parent, there still no such func. You should NEVER be matching on a parent. If so, your use of the driver model is wrong :) Remind me to really review the use of the driver core code in your next submission of this series please, I think it needs it. OK, thanks Greg.
Re: [PATCH 2/2] uacce: add uacce module
Hi, Jonathan Thanks for your careful review and good suggestion. Sorry for late response, I am checking one detail. On 2019/8/16 上午12:54, Jonathan Cameron wrote: On Wed, 14 Aug 2019 17:34:25 +0800 Zhangfei Gao wrote: From: Kenneth Lee Uacce is the kernel component to support WarpDrive accelerator framework. It provides register/unregister interface for device drivers to expose their hardware resource to the user space. The resource is taken as "queue" in WarpDrive. It's a bit confusing to have both the term UACCE and WarpDrive in here. I'd just use the uacce name in all comments etc. Yes, make sense Uacce create a chrdev for every registration, the queue is allocated to the process when the chrdev is opened. Then the process can access the hardware resource by interact with the queue file. By mmap the queue file space to user space, the process can directly put requests to the hardware without syscall to the kernel space. Uacce also manages unify addresses between the hardware and user space of the process. So they can share the same virtual address in the communication. Signed-off-by: Kenneth Lee Signed-off-by: Zaibo Xu Signed-off-by: Zhou Wang Signed-off-by: Zhangfei Gao I would strip this back to which ever case is of most interest (SVA I guess?) and only think about adding support for the others if necessary at a later date. (or in later patches). Do you mean split the patch and send sva part first? + +static int uacce_qfr_alloc_pages(struct uacce_qfile_region *qfr) +{ + int gfp_mask = GFP_ATOMIC | __GFP_ZERO; More readable to just have this inline. Yes, all right. + int i, j; + + qfr->pages = kcalloc(qfr->nr_pages, sizeof(*qfr->pages), gfp_mask); kcalloc is always set to zero anyway. OK + +static struct uacce_qfile_region * +uacce_create_region(struct uacce_queue *q, struct vm_area_struct *vma, + enum uacce_qfrt type, unsigned int flags) +{ + struct uacce_qfile_region *qfr; + struct uacce *uacce = q->uacce; + unsigned long vm_pgoff; + int ret = -ENOMEM; + + dev_dbg(uacce->pdev, "create qfr (type=%x, flags=%x)\n", type, flags); + qfr = kzalloc(sizeof(*qfr), GFP_ATOMIC); + if (!qfr) + return ERR_PTR(-ENOMEM); + + qfr->type = type; + qfr->flags = flags; + qfr->iova = vma->vm_start; + qfr->nr_pages = vma_pages(vma); + + if (vma->vm_flags & VM_READ) + qfr->prot |= IOMMU_READ; + + if (vma->vm_flags & VM_WRITE) + qfr->prot |= IOMMU_WRITE; + + if (flags & UACCE_QFRF_SELFMT) { + ret = uacce->ops->mmap(q, vma, qfr); + if (ret) + goto err_with_qfr; + return qfr; + } + + /* allocate memory */ + if (flags & UACCE_QFRF_DMA) { + dev_dbg(uacce->pdev, "allocate dma %d pages\n", qfr->nr_pages); + qfr->kaddr = dma_alloc_coherent(uacce->pdev, qfr->nr_pages << + PAGE_SHIFT, &qfr->dma, + GFP_KERNEL); + if (!qfr->kaddr) { + ret = -ENOMEM; + goto err_with_qfr; + } + } else { + dev_dbg(uacce->pdev, "allocate %d pages\n", qfr->nr_pages); + ret = uacce_qfr_alloc_pages(qfr); + if (ret) + goto err_with_qfr; + } + + /* map to device */ + ret = uacce_queue_map_qfr(q, qfr); + if (ret) + goto err_with_pages; + + /* mmap to user space */ + if (flags & UACCE_QFRF_MMAP) { + if (flags & UACCE_QFRF_DMA) { + + /* dma_mmap_coherent() requires vm_pgoff as 0 +* restore vm_pfoff to initial value for mmap() +*/ + dev_dbg(uacce->pdev, "mmap dma qfr\n"); + vm_pgoff = vma->vm_pgoff; + vma->vm_pgoff = 0; + ret = dma_mmap_coherent(uacce->pdev, vma, qfr->kaddr, + qfr->dma, + qfr->nr_pages << PAGE_SHIFT); Does setting vm_pgoff if (ret) make sense? Since we need restore the environment, so restore vm_pgoff no matter succeed or not. + vma->vm_pgoff = vm_pgoff; + } else { + ret = uacce_queue_mmap_qfr(q, qfr, vma); + } + + if (ret) + goto err_with_mapped_qfr; + } + + return qfr; + +err_with_mapped_qfr: + uacce_queue_unmap_qfr(q, qfr); +err_with_pages: + if (flags & UACCE_QFRF_DMA)
Re: [PATCH 2/2] uacce: add uacce module
On 2019/8/21 下午5:17, Greg Kroah-Hartman wrote: On Wed, Aug 21, 2019 at 03:21:18PM +0800, zhangfei@foxmail.com wrote: Hi, Greg On 2019/8/21 上午12:59, Greg Kroah-Hartman wrote: On Tue, Aug 20, 2019 at 09:08:55PM +0800, zhangfei wrote: On 2019/8/15 下午10:13, Greg Kroah-Hartman wrote: On Wed, Aug 14, 2019 at 05:34:25PM +0800, Zhangfei Gao wrote: +int uacce_register(struct uacce *uacce) +{ + int ret; + + if (!uacce->pdev) { + pr_debug("uacce parent device not set\n"); + return -ENODEV; + } + + if (uacce->flags & UACCE_DEV_NOIOMMU) { + add_taint(TAINT_CRAP, LOCKDEP_STILL_OK); + dev_warn(uacce->pdev, +"Register to noiommu mode, which export kernel data to user space and may vulnerable to attack"); + } THat is odd, why even offer this feature then if it is a major issue? UACCE_DEV_NOIOMMU maybe confusing here. In this mode, app use ioctl to get dma_handle from dma_alloc_coherent. That's odd, why not use the other default apis to do that? It does not matter iommu is enabled or not. In case iommu is disabled, it maybe dangerous to kernel, so we added warning here, is it required? You should use the other documentated apis for this, don't create your own. I am sorry, not understand here. Do you mean there is a standard ioctl or standard api in user space, it can get dma_handle from dma_alloc_coherent from kernel? There should be a standard way to get such a handle from userspace today. Isn't that what the ion interface does? DRM also does this, as does UIO I think. Thanks Greg, Still not find it, will do more search. But this may introduce dependency in our lib, like depend on ion? Do you have a spec somewhere that shows exactly what you are trying to do here, along with example userspace code? It's hard to determine it given you only have one "half" of the code here and no users of the apis you are creating. The purpose is doing dma in user space. If sva is supported, we can directly use malloc memory. If sva is not supported, device can not recognize va, so we get dma_handle via ioctl. Sample user code is in https://github.com/Linaro/warpdrive/blob/wdprd-v1-current/test/test_hisi_zip.c hizip_wd_sched_init_cache: if (sched->qs[0].dev_flags & UACCE_DEV_NOIOMMU) { data_in = wd_get_pa_from_va(&sched->qs[0], wd_msg->data_in); data_out = wd_get_pa_from_va(&sched->qs[0], wd_msg->data_out); } else { data_in = wd_msg->data_in; data_out = wd_msg->data_out; } msg->source_addr_l = (__u64)data_in & 0x; msg->source_addr_h = (__u64)data_in >> 32; msg->dest_addr_l = (__u64)data_out & 0x; msg->dest_addr_h = (__u64)data_out >> 32; Have added some explanation in warpdrive.rst, in the first patch. The device can also be declared as UACCE_DEV_NOIOMMU. It can be used when the device has no iommu support or the iommu is set in pass through mode. In this case, the driver should map address to device by itself with DMA API. The ioctl(UACCE_CMD_GET_SS_DMA) can be used to get the physical address, though it is an untrusted and kernel-tainted behavior. Thanks
Re: [PATCH 0/2] A General Accelerator Framework, WarpDrive
Hi, Jerome Thanks for your suggestion On 2019/8/16 上午1:04, Jerome Glisse wrote: On Wed, Aug 14, 2019 at 05:34:23PM +0800, Zhangfei Gao wrote: *WarpDrive* is a general accelerator framework for the user application to access the hardware without going through the kernel in data path. WarpDrive is the name for the whole framework. The component in kernel is called uacce, meaning "Unified/User-space-access-intended Accelerator Framework". It makes use of the capability of IOMMU to maintain a unified virtual address space between the hardware and the process. WarpDrive is intended to be used with Jean Philippe Brucker's SVA patchset[1], which enables IO side page fault and PASID support. We have keep verifying with Jean's sva/current [2] We also keep verifying with Eric's SMMUv3 Nested Stage patch [3] This series and related zip & qm driver as well as dummy driver for qemu test: https://github.com/Linaro/linux-kernel-warpdrive/tree/5.3-rc1-warpdrive-v1 zip driver already been upstreamed. zip supporting uacce will be the next step. The library and user application: https://github.com/Linaro/warpdrive/tree/wdprd-v1-current Do we want a new framework ? I think that is the first question that should be answer here. Accelerator are in many forms and so far they never have been enough commonality to create a framework, even GPUs with the drm is an example of that, drm only offer share framework for the modesetting part of the GPU (as thankfully monitor connector are not specific to GPU brands :)) FPGA is another example the only common code expose to userspace is about bitstream management AFAIK. I would argue that a framework should only be created once there is enough devices with same userspace API. Meanwhile you can provide in kernel helper that allow driver to expose same API. If after a while we have enough device driver which all use that same in kernel helpers API then it will a good time to introduce a new framework. Meanwhile this will allow individual device driver to tinker with their API and maybe get to something useful to more devices in the end. Note that what i propose also allow userspace code sharing for all driver that use the same in kernel helper. Yes, we understand it is not easy for a new framework. There are discussions in rfc2 (2018/9) and rfc3 (2018/11). To make life easier, we plan to move the uacce to driver/misc to support our own product first until it is mature. Using uacce, Currently we get quite a big performance improvement in our crypto product, like zip, hpre, sec. Our final goal is "A General Accelerator Framework", which maybe ambitious. So uacce is designed to be a common framework, can be easily supporting more accelerators. And we are happy to get more requirements and make it mature. Another good point is SVA support in ongoing, http://jpbrucker.net/sva/ After sva mature, the accelerators support will be much easier. Thanks
Re: [PATCH 2/2] uacce: add uacce module
On 2019/8/15 下午10:13, Greg Kroah-Hartman wrote: On Wed, Aug 14, 2019 at 05:34:25PM +0800, Zhangfei Gao wrote: +int uacce_register(struct uacce *uacce) +{ + int ret; + + if (!uacce->pdev) { + pr_debug("uacce parent device not set\n"); + return -ENODEV; + } + + if (uacce->flags & UACCE_DEV_NOIOMMU) { + add_taint(TAINT_CRAP, LOCKDEP_STILL_OK); + dev_warn(uacce->pdev, +"Register to noiommu mode, which export kernel data to user space and may vulnerable to attack"); + } THat is odd, why even offer this feature then if it is a major issue? UACCE_DEV_NOIOMMU maybe confusing here. In this mode, app use ioctl to get dma_handle from dma_alloc_coherent. It does not matter iommu is enabled or not. In case iommu is disabled, it maybe dangerous to kernel, so we added warning here, is it required? We support two modes. 1. support sva, (va = iova) a. If pasid is supported, multi-process can be supported. b. If pasid is not supported, multi-process can NOT be supported. We borrowed va from user space to achieve a single virtual address system. 2. Can not support sva, iova != va, Here user app get dma_handle from dma_alloc_coherent via ioctl. We need this mode for two reasons: 1. This mode can support multi-process, to support openssl etc. 2. Some accelerators (crypto like zip, etc) can work without sva, just prepare data and kick start. Thanks
Re: [PATCH 2/2] uacce: add uacce module
On 2019/8/19 下午6:22, Greg Kroah-Hartman wrote: On Mon, Aug 19, 2019 at 05:09:23PM +0800, zhangfei@foxmail.com wrote: Hi, Greg Thanks for your kind suggestion. On 2019/8/15 下午10:12, Greg Kroah-Hartman wrote: On Wed, Aug 14, 2019 at 05:34:25PM +0800, Zhangfei Gao wrote: diff --git a/include/uapi/misc/uacce.h b/include/uapi/misc/uacce.h new file mode 100644 index 000..44a0a5d --- /dev/null +++ b/include/uapi/misc/uacce.h @@ -0,0 +1,44 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +#ifndef _UAPIUUACCE_H +#define _UAPIUUACCE_H + +#include +#include + +#define UACCE_CLASS_NAME "uacce" Why is this in a uapi file? User app need get attribute from /sys/class, For example: /sys/class/uacce/hisi_zip-0/xxx UACCE_CLASS_NAME here tells app which subsystem to open, /sys/class/subsystem/ But that never comes from a uapi file. No other subsystem does this. OK, got it Maybe the entry info can come from Documentation/ABI/entries Thanks
Re: [PATCH 2/2] uacce: add uacce module
Hi, Greg On 2019/8/19 下午6:24, Greg Kroah-Hartman wrote: +static int uacce_create_chrdev(struct uacce *uacce) +{ + int ret; + + ret = idr_alloc(&uacce_idr, uacce, 0, 0, GFP_KERNEL); + if (ret < 0) + return ret; + Shouldn't this function create the memory needed for this structure? You are relying ont he caller to do it for you, why? I think you mean uacce structure here. Yes, currently we count on caller to prepare uacce structure and call uacce_register(uacce). We still think this method is simpler, prepare uacce, register uacce. And there are other system using the same method, like crypto (crypto_register_acomp), nand, etc. crypto is not a subsystem to ever try to emulate :) You are creating a structure with a lifetime that you control, don't have someone else create your memory, that's almost never what you want to do. Most all driver subsystems create their own memory chunks for what they need to do, it's a much better pattern. Especially when you get into pointer lifetime issues... OK, understand now, thanks for your patience. will use this instead. struct uacce_interface { char name[32]; unsigned int flags; struct uacce_ops *ops; }; struct uacce *uacce_register(struct device *dev, struct uacce_interface *interface); +} + +static int uacce_dev_match(struct device *dev, void *data) +{ + if (dev->parent == data) + return -EBUSY; There should be in-kernel functions for this now, no need for you to roll your own. Sorry, do not find this function. Only find class_find_device, which still require match. It is in linux-next, look there... Suppose you mean the funcs: device_match_name, device_match_of_node,device_match_devt etc. Here we need dev->parent, there still no such func. Thanks
[PATCH 2/2] uacce: add uacce module
From: Kenneth Lee Uacce is the kernel component to support WarpDrive accelerator framework. It provides register/unregister interface for device drivers to expose their hardware resource to the user space. The resource is taken as "queue" in WarpDrive. Uacce create a chrdev for every registration, the queue is allocated to the process when the chrdev is opened. Then the process can access the hardware resource by interact with the queue file. By mmap the queue file space to user space, the process can directly put requests to the hardware without syscall to the kernel space. Uacce also manages unify addresses between the hardware and user space of the process. So they can share the same virtual address in the communication. Signed-off-by: Kenneth Lee Signed-off-by: Zaibo Xu Signed-off-by: Zhou Wang Signed-off-by: Zhangfei Gao --- drivers/misc/Kconfig|1 + drivers/misc/Makefile |1 + drivers/misc/uacce/Kconfig | 13 + drivers/misc/uacce/Makefile |2 + drivers/misc/uacce/uacce.c | 1186 +++ include/linux/uacce.h | 109 include/uapi/misc/uacce.h | 44 ++ 7 files changed, 1356 insertions(+) create mode 100644 drivers/misc/uacce/Kconfig create mode 100644 drivers/misc/uacce/Makefile create mode 100644 drivers/misc/uacce/uacce.c create mode 100644 include/linux/uacce.h create mode 100644 include/uapi/misc/uacce.h diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig index 6abfc8e..8073eb8 100644 --- a/drivers/misc/Kconfig +++ b/drivers/misc/Kconfig @@ -502,4 +502,5 @@ source "drivers/misc/cxl/Kconfig" source "drivers/misc/ocxl/Kconfig" source "drivers/misc/cardreader/Kconfig" source "drivers/misc/habanalabs/Kconfig" +source "drivers/misc/uacce/Kconfig" endmenu diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile index abd8ae2..93a131b 100644 --- a/drivers/misc/Makefile +++ b/drivers/misc/Makefile @@ -58,4 +58,5 @@ obj-$(CONFIG_OCXL)+= ocxl/ obj-y += cardreader/ obj-$(CONFIG_PVPANIC) += pvpanic.o obj-$(CONFIG_HABANA_AI)+= habanalabs/ +obj-$(CONFIG_UACCE)+= uacce/ obj-$(CONFIG_XILINX_SDFEC) += xilinx_sdfec.o diff --git a/drivers/misc/uacce/Kconfig b/drivers/misc/uacce/Kconfig new file mode 100644 index 000..569669c --- /dev/null +++ b/drivers/misc/uacce/Kconfig @@ -0,0 +1,13 @@ +config UACCE + tristate "Accelerator Framework for User Land" + depends on IOMMU_API + help + UACCE provides interface for the user process to access the hardware + without interaction with the kernel space in data path. + + The user-space interface is described in + include/uapi/misc/uacce.h + + See Documentation/misc-devices/warpdrive.rst for more details. + + If you don't know what to do here, say N. diff --git a/drivers/misc/uacce/Makefile b/drivers/misc/uacce/Makefile new file mode 100644 index 000..5b4374e --- /dev/null +++ b/drivers/misc/uacce/Makefile @@ -0,0 +1,2 @@ +# SPDX-License-Identifier: GPL-2.0-or-later +obj-$(CONFIG_UACCE) += uacce.o diff --git a/drivers/misc/uacce/uacce.c b/drivers/misc/uacce/uacce.c new file mode 100644 index 000..43e0c9b --- /dev/null +++ b/drivers/misc/uacce/uacce.c @@ -0,0 +1,1186 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +static struct class *uacce_class; +static DEFINE_IDR(uacce_idr); +static dev_t uacce_devt; +static DEFINE_MUTEX(uacce_mutex); /* mutex to protect uacce */ + +/* lock to protect all queues management */ +static DECLARE_RWSEM(uacce_qs_lock); +#define uacce_qs_rlock() down_read(&uacce_qs_lock) +#define uacce_qs_runlock() up_read(&uacce_qs_lock) +#define uacce_qs_wlock() down_write(&uacce_qs_lock) +#define uacce_qs_wunlock() up_write(&uacce_qs_lock) + +static const struct file_operations uacce_fops; + +/* match with enum uacce_qfrt */ +static const char *const qfrt_str[] = { + "mmio", + "dko", + "dus", + "ss", + "invalid" +}; + +static const char *uacce_qfrt_str(struct uacce_qfile_region *qfr) +{ + enum uacce_qfrt type = qfr->type; + + if (type > UACCE_QFRT_INVALID) + type = UACCE_QFRT_INVALID; + + return qfrt_str[type]; +} + +/** + * uacce_wake_up - Wake up the process who is waiting this queue + * @q the accelerator queue to wake up + */ +void uacce_wake_up(struct uacce_queue *q) +{ + dev_dbg(&q->uacce->dev, "wake up\n"); + wake_up_interruptible(&q->wait); +} +EXPORT_SYMBOL_GPL(uacce_wake_up); + +static int uacce_queue_map_qfr(struct uacce_queue *q, + struct uacce_qfile_region *qfr) +{ + struct device *dev = q->uacc
[PATCH 1/2] uacce: Add documents for WarpDrive/uacce
From: Kenneth Lee WarpDrive is a general accelerator framework for the user application to access the hardware without going through the kernel in data path. The kernel component to provide kernel facility to driver for expose the user interface is called uacce. It a short name for "Unified/User-space-access-intended Accelerator Framework". This patch add document to explain how it works. Signed-off-by: Kenneth Lee Signed-off-by: Zaibo Xu Signed-off-by: Zhou Wang Signed-off-by: Zhangfei Gao --- Documentation/misc-devices/warpdrive.rst | 351 +++ 1 file changed, 351 insertions(+) create mode 100644 Documentation/misc-devices/warpdrive.rst diff --git a/Documentation/misc-devices/warpdrive.rst b/Documentation/misc-devices/warpdrive.rst new file mode 100644 index 000..14e5939 --- /dev/null +++ b/Documentation/misc-devices/warpdrive.rst @@ -0,0 +1,351 @@ +.. SPDX-License-Identifier: GPL-2.0 + +Introduction of WarpDrive += + +*WarpDrive* is a general accelerator framework for the user application to +communicate with the hardware without going through the kernel in data path. + +It can be used as a quick channel for accelerators, network adaptors or +other hardware for application in user space. + +It may also make some exist solution simpler. E.g. you can reuse most of the +*netdev* driver in kernel and just share some ring buffer to the user space +driver for *DPDK* or *ODP*. Or you can combine the RSA accelerator +with the *netdev* in the user space as a https reverse proxy, etc. + +*WarpDrive* takes the hardware accelerator as a heterogeneous processor which +can share particular load from the CPU: + +__ __ + | | | | + | User application (CPU) | | Hardware Accelerator | + |__| |__| + +| | +| | +V V + ____ +| | | | +| MMU| | IOMMU | +|__| |__| + \ / + \ / + \ / +__ + | | + | Memory | + |__| + + + +Architecture + + +*WarpDrive* includes general user libraries, kernel management modules +and drivers for the hardware. In kernel, the management module +is called *uacce*, meaning "Unified/User-space-access-intended +Accelerator Framework". + +A virtual concept, queue, is used for the communication. It provides a +FIFO-like interface. And it maintains a unified address space between the +application and all involved hardware. + + ___ +| | user API | | +| WarpDrive library | > | user driver | +|___| || + || + || + | queue fd | + || + || + v| + ___ _| +| | | | | mmap memory +| Other framework | | uacce | | r/w interface +| crypto/nic/others | |_| | +|___| | + | || + | register | register | + | || + | || + |_ __ | + | | | | | | + - | Device
[PATCH 0/2] A General Accelerator Framework, WarpDrive
*WarpDrive* is a general accelerator framework for the user application to access the hardware without going through the kernel in data path. WarpDrive is the name for the whole framework. The component in kernel is called uacce, meaning "Unified/User-space-access-intended Accelerator Framework". It makes use of the capability of IOMMU to maintain a unified virtual address space between the hardware and the process. WarpDrive is intended to be used with Jean Philippe Brucker's SVA patchset[1], which enables IO side page fault and PASID support. We have keep verifying with Jean's sva/current [2] We also keep verifying with Eric's SMMUv3 Nested Stage patch [3] This series and related zip & qm driver as well as dummy driver for qemu test: https://github.com/Linaro/linux-kernel-warpdrive/tree/5.3-rc1-warpdrive-v1 zip driver already been upstreamed. zip supporting uacce will be the next step. The library and user application: https://github.com/Linaro/warpdrive/tree/wdprd-v1-current Change History: v4 changed from V3 1. Rebase to 5.3-rc1 2. Build on iommu interface 3. Verifying with Jean's sva and Eric's nested mode iommu. 4. User library has developed a lot: support zlib, openssl etc. 5. Move to misc first V3 changed from V2: https://lkml.org/lkml/2018/11/12/1951 1. Build uacce from original IOMMU interface. V2 is built on VFIO. But the VFIO way locking the user memory in place will not work properly if the process fork a child. Because the copy-on-write strategy will make the parent process lost its page. This is not acceptable to accelerator user. 2. The kernel component is renamed to uacce from sdmdev accordingly 3. Document is updated for the new design. The Static Shared Virtual Memory concept is introduced to replace the User Memory Sharing concept. 4. Rebase to the lastest kernel (4.20.0-rc1) 5. As an RFC, this version is tested only with "test-to-pass" test case and not tested with Jean's SVA patch. V2 changed from V1: https://lwn.net/Articles/763990/ 1. Change kernel framework name from SPIMDEV (Share Parent IOMMU Mdev) to SDMDEV (Share Domain Mdev). 2. Allocate Hardware Resource when a new mdev is created (While it is allocated when the mdev is openned) 3. Unmap pages from the shared domain when the sdmdev iommu group is detached. (This procedure is necessary, but missed in V1) 4. Update document accordingly. 5. Rebase to the latest kernel (4.19.0-rc1) References: [1] http://jpbrucker.net/sva/ [2] http://www.linux-arm.org/git?p=linux-jpb.git;a=shortlog;h=refs/heads/sva/current [3] https://github.com/eauger/linux/tree/v5.3.0-rc0-2stage-v9 Kenneth Lee (2): uacce: Add documents for WarpDrive/uacce uacce: add uacce module Documentation/misc-devices/warpdrive.rst | 351 + drivers/misc/Kconfig |1 + drivers/misc/Makefile|1 + drivers/misc/uacce/Kconfig | 13 + drivers/misc/uacce/Makefile |2 + drivers/misc/uacce/uacce.c | 1186 ++ include/linux/uacce.h| 109 +++ include/uapi/misc/uacce.h| 44 ++ 8 files changed, 1707 insertions(+) create mode 100644 Documentation/misc-devices/warpdrive.rst create mode 100644 drivers/misc/uacce/Kconfig create mode 100644 drivers/misc/uacce/Makefile create mode 100644 drivers/misc/uacce/uacce.c create mode 100644 include/linux/uacce.h create mode 100644 include/uapi/misc/uacce.h -- 2.7.4
Re: linux-next: build failure after merge of the slave-dma tree
Hi, Robin On 2019/7/8 上午9:22, Robin Gong wrote: Hi Stephen, That's caused by 'of_irq_count' NOT export to global symbol, and I'm curious why it has been here for so long since Zhangfei found it in 2015. https://patchwork.kernel.org/patch/7404681/ Hi Rob, Is there something I miss so that Zhangfei's patch not accepted finally? I remembered Rob suggested us not using of_irq_count and use platform_get_irq etc. https://lkml.org/lkml/2015/11/18/466 So we remove of_irq_count commit 8c77dca011125b795bfa1c86f85a80132feee578 Author: John Garry Date: Thu Nov 19 20:23:59 2015 +0800 hisi_sas: Remove dependency on of_irq_count Thanks
答复: [RFC] a question about reuse hwpoison page in soft_offline_page()
Hi Naoya Any Update on this issue?Is there a final conclusion on how to fix this issue? Thinks -邮件原件- 发件人: Naoya Horiguchi [mailto:n-horigu...@ah.jp.nec.com] 发送时间: 2018年7月23日 14:11 收件人: Zhangfei (Tyler) 抄送: Xiexiuqi ; 裘稀石(稀石) ; linux-mm ; linux-kernel ; zy.zhengyi ; lvzhipeng ; meinanjing ; zhongjiang ; Dukaitian ; Chenglongfei 主题: Re: [RFC] a question about reuse hwpoison page in soft_offline_page() On Fri, Jul 20, 2018 at 08:50:22AM +, Zhangfei (Tyler) wrote: > Hi Naoya&xishi: > We have a similar problem, the difference is that we did not Enable > hugepage, the soft-offline was executed in the case of normal 4K pages, and > finally the MCE kill was triggered(find hwpoison flag is already set-->ret = > VM_FAULT_HWPOISON-->mm_fault_error -->do_sigbus --> mce kill). We noticed > that the new patch made some modifications to the case of huge page offline, > But how can we avoid this race condition for the case of normal page? Hi Tyler, Latest version of the fix is available on https://lkml.org/lkml/2018/7/17/60. I'm still discussing with Michal about better design of this area, but I think we'll go with this for short term fix. Thanks, Naoya Horiguchi > > -邮件原件- > 发件人: Xiexiuqi > 发送时间: 2018年7月20日 15:50 > 收件人: Naoya Horiguchi ; 裘稀石(稀石) > > 抄送: linux-mm ; linux-kernel > ; zy.zhengyi > ; Zhangfei (Tyler) > ; lvzhipeng ; meinanjing > ; zhongjiang > 主题: Re: [RFC] a question about reuse hwpoison page in > soft_offline_page() > > Hi Naoya, Xishi, > > We have a similar problem. > @zhangfei, could you please describe your problem here. > > On 2018/7/6 16:18, Naoya Horiguchi wrote: > > On Fri, Jul 06, 2018 at 11:37:41AM +0800, 裘稀石(稀石) wrote: > >> This patch add05cec > >> (mm: soft-offline: don't free target page in successful page > >> migration) removes > >> set_migratetype_isolate() and unset_migratetype_isolate() in > >> soft_offline_page (). > >> > >> And this patch 243abd5b > >> (mm: hugetlb: prevent reuse of hwpoisoned free hugepages) changes > >> if > >> (!is_migrate_isolate_page(page)) to if (!PageHWPoison(page)), so it > >> could prevent someone reuse the free hugetlb again after set the > >> hwpoison flag in soft_offline_free_page() > >> > >> My question is that if someone reuse the free hugetlb again before > >> soft_offline_free_page() and > >> after get_any_page(), then it uses the hopoison page, and this may > >> trigger mce kill later, right? > > > > Hi Xishi, > > > > Thank you for pointing out the issue. That's nice catch. > > > > I think that the race condition itself could happen, but it doesn't > > lead to MCE kill because PageHWPoison is not visible to HW which triggers > > MCE. > > PageHWPoison flag is just a flag in struct page to report the memory > > error from kernel to userspace. So even if a CPU is accessing to the > > page whose struct page has PageHWPoison set, that doesn't cause a > > MCE unless the page is physically broken. > > The type of memory error that soft offline tries to handle is > > corrected one which is not a failure yet although it's starting to wear. > > So such PageHWPoison page can be reused, but that's not critical > > because the page is freed at some point afterword and error containment > > completes. > > > > However, I noticed that there's a small pain in free hugetlb case. > > We call dissolve_free_huge_page() in soft_offline_free_page() which > > moves the PageHWPoison flag from the head page to the raw error page. > > If the reported race happens, dissolve_free_huge_page() just return > > without doing any dissolve work because "if (PageHuge(page) && > > !page_count(page))" > > block is skipped. > > The hugepage is allocated and used as usual, but the contaiment > > doesn't complete as expected in the normal page, because > > free_huge_pages() doesn't call dissolve_free_huge_page() for > > hwpoison hugepage. This is not critical because such error hugepage > > just reside in free hugepage list. But this might looks like a kind > > of memory leak. And even worse when hugepage pool is shrinked and > > the hwpoison hugepage is freed, the PageHWPoison flag is still on the head > > page which is unlikely to be an actual error page. > > > > So I think we need improvement here, how about the fix like below? > > > > (not tested yet, sorry) > > > > diff --git a
Re: [PATCH] mmc: dw_mmc-k3: Fix DDR52 mode by setting required clock divisor
+ Hisilicon colleague On 2018年04月05日 08:51, Shawn Lin wrote: [+ Zhangfei Gao who added support for hi6220] On 2018/4/4 23:31, Ryan Grachek wrote: On Tue, Apr 3, 2018 at 6:31 AM, Shawn Lin <mailto:shawn@rock-chips.com>> wrote: On 2018/3/30 2:24, oscardagrach wrote: Need at least one line commit body. Signed-off-by: oscardagrach <mailto:r...@edited.us>> --- drivers/mmc/host/dw_mmc-k3.c | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/drivers/mmc/host/dw_mmc-k3.c b/drivers/mmc/host/dw_mmc-k3.c index 89cdb3d533bb..efc546cb4db8 100644 --- a/drivers/mmc/host/dw_mmc-k3.c +++ b/drivers/mmc/host/dw_mmc-k3.c @@ -194,8 +194,14 @@ static void dw_mci_hi6220_set_ios(struct dw_mci *host, struct mmc_ios *ios) int ret; unsigned int clock; - clock = (ios->clock <= 2500) ? 2500 : ios->clock; - + /* CLKDIV must be 1 for DDR52/8-bit mode */ + if (ios->bus_width == MMC_BUS_WIDTH_8 && + ios->timing == MMC_TIMING_MMC_DDR52) { + mci_writel(host, CLKDIV, 0x1); + clock = ios->clock; + } else { + clock = (ios->clock <= 2500) ? 2500 : ios->clock; + } I undertand DDR52/8-bit need CLKDIV fixed 1, but shouldn't the following change is more sensible? if (ios->bus_width == MMC_BUS_WIDTH_8 && ios->timing == MMC_TIMING_MMC_DDR52) clock = ios->clock * 2; else clock = (ios->clock <= 2500) ? 2500 : ios->clock; The reason is ios->clock is 52MHz and you could claim 104MHz from the clock provider and let dw_mmc core take care of the divder to be 1. Otherwise, you just force it to be DDR52/8-bit with a clk rate of 26MHz. ret = clk_set_rate(host->biu_clk, clock); if (ret) dev_warn(host->dev, "failed to set rate %uHz\n", clock); For future wise, please use plain mode mail, but not HTML format. Your feedback is correct. I see the Rockchip dwmmc driver has a similar implementation. After applying your suggested changes, however, my board reports "dwmmc_k3 f723d000.dwmmc0: failed to set rate 10400Hz" during intialization of eMMC. In addition, I do not see CLKDIV being set to 1. clk_set_rate fails and I wonder if this is out-of-scope of the driver. If I set CLKDIV where I did prior, with your changes, the device fails to set the clock and falls back to 52 MHz (26 MHz) and works fine, but again, CLKDIV is reported as 0 (even though it is 1.) One thing of interest to note is when I manually set the clock by doing: (echo 10400 > /sys/kernel/debug/mmc0/clock) the device reports back 'mmc_host mmc0: Bus speed (slot 0) = 19840Hz (slot req 10400Hz, actual 9920HZ div = 1)' which works reliably and clk_set_rate does not report any error. When looking closely into the code, at least dw_mci_hi6220_set_ios goes wrong with the bus_hz, since it should be ciu_clk but not biu_clk. "b" stands for bus, and "c" stands for card IMHO, however bus_hz describs the clock to the card, provided by controller. Does the following patch help? diff --git a/drivers/mmc/host/dw_mmc-k3.c b/drivers/mmc/host/dw_mmc-k3.c index 89cdb3d..9e78cf2 100644 --- a/drivers/mmc/host/dw_mmc-k3.c +++ b/drivers/mmc/host/dw_mmc-k3.c @@ -194,13 +194,21 @@ static void dw_mci_hi6220_set_ios(struct dw_mci *host, struct mmc_ios *ios) int ret; unsigned int clock; - clock = (ios->clock <= 2500) ? 2500 : ios->clock; + if (ios->bus_width == MMC_BUS_WIDTH_8 && + ios->timing == MMC_TIMING_MMC_DDR52) + clock = ios->clock * 2; + else + clock = (ios->clock <= 2500) ? 2500 : ios->clock; - ret = clk_set_rate(host->biu_clk, clock); + ret = clk_set_rate(host->ciu_clk, clock); if (ret) dev_warn(host->dev, "failed to set rate %uHz\n", clock); - host->bus_hz = clk_get_rate(host->biu_clk); + clock = clk_get_rate(host->ciu_clk); + if (clock != host->bus_hz) { + host->bus_hz = clock; + host->current_speed = 0; + } } I am not sure where to begin debugging these clock issues and welcome any feedback.
Re: [PATCH v7 0/5] scsi: ufs: add ufs driver code for Hisilicon Hi3660 SoC
Hi, Wei On 2018年01月06日 17:51, Li Wei wrote: This patchset adds driver support for UFS for Hi3660 SoC. It is verified on HiKey960 board. Usually here should list the change compared with the last change set, to make it easier to reviewer, who may pay more attention to the differences. For example v7:xxx //change since v6 v6:xxx // change since v5 Li Wei (5): scsi: ufs: add Hisilicon ufs driver code dt-bindings: scsi: ufs: add document for hisi-ufs arm64: dts: add ufs dts node arm64: defconfig: enable configs for Hisilicon ufs arm64: defconfig: enable f2fs and squashfs Documentation/devicetree/bindings/ufs/ufs-hisi.txt | 43 ++ arch/arm64/boot/dts/hisilicon/hi3660.dtsi | 20 + arch/arm64/configs/defconfig | 11 + drivers/scsi/ufs/Kconfig | 9 + drivers/scsi/ufs/Makefile | 1 + drivers/scsi/ufs/ufs-hisi.c| 621 + drivers/scsi/ufs/ufs-hisi.h| 116 7 files changed, 821 insertions(+) create mode 100644 Documentation/devicetree/bindings/ufs/ufs-hisi.txt create mode 100644 drivers/scsi/ufs/ufs-hisi.c create mode 100644 drivers/scsi/ufs/ufs-hisi.h
Re: [PATCH 09/10] dmaengine: k3dma: Use vchan_terminate_vdesc() instead of desc_free
On 2017年11月14日 22:32, Peter Ujfalusi wrote: To avoid race with vchan_complete, use the race free way to terminate running transfer. Implement the device_synchronize callback to make sure that the terminated descriptor is freed. CC: Zhangfei Gao Signed-off-by: Peter Ujfalusi Thanks Peter. Acked-by: Zhangfei Gao
Re: [PATCH] dma: k3dma: Fix non-cyclic mode
On 2017年07月11日 04:53, John Stultz wrote: From: Antonio Borneo Commit 36387a2b1f62b5c087c5fe6f0f7b23b94f722ad7 ("k3dma: Fix memory handling in preparation for cyclic mode") broke the logic around ds_run/ds_done in case of non-cyclic DMA. This went unnoticed as the only user of k3dma was the i2s audio driver, but with a patch set to enable dma on SPI, the issue cropped up. This patch resolves the issue by reverting part of the problematic commit. This patch has been tested to ensure both audio playback and SPI works fine using DMA and that no memory leak is present. Cc: Vinod Koul Cc: Dan Williams Cc: Zhangfei Gao Cc: dmaeng...@vger.kernel.org Signed-off-by: Antonio Borneo [jstultz: Expanded commit message a bit] Signed-off-by: John Stultz Acked-by: Zhangfei Gao Thanks
Re: [PATCH v2 3/3] clk: hi3660: Set PPLL2 to 2880M
On 2017年05月26日 15:38, Guodong Xu wrote: From: Zhong Kaihua Set PPLL2 to 2880M. With this patch, we saw better compatibility on various 1080p HDMI monitors. Signed-off-by: Zhong Kaihua Signed-off-by: Zheng Shaobo Acked-by: Zhangfei Gao
Re: [PATCH v2 1/3] clk: hi3660: fix wrong parent name of clk_mux_sysbus
On 2017年05月26日 15:38, Guodong Xu wrote: From: Chen Jun Parent name of clk_mux_sysbus is not correct. This patch fixes it. Signed-off-by: Chen Jun Signed-off-by: John Stultz Signed-off-by: Guodong Xu Acked-by: Zhangfei Gao