[PATCH 3/3] powerpc/powernv/npu: Remove atsd_threshold debugfs setting

2018-09-27 Thread Mark Hairgrove
This threshold is no longer used now that all invalidates issue a single
ATSD to each active NPU.

Signed-off-by: Mark Hairgrove
---
 arch/powerpc/platforms/powernv/npu-dma.c | 13 -
 1 files changed, 0 insertions(+), 13 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/npu

[PATCH 2/3] powerpc/powernv/npu: Use size-based ATSD invalidates

2018-09-27 Thread Mark Hairgrove
tride   Before   After   Speedup
64K        6.5     7.4       13%
1M        33.4    67.9      103%
2M        38.7    93.1      141%
4M       356.7   354.6       -1%

Anything over 2M is roughly the same as before since both cases issue a
single ATSD.

Signed-off-by:

[PATCH 1/3] powerpc/powernv/npu: Reduce eieio usage when issuing ATSD invalidates

2018-09-27 Thread Mark Hairgrove
7%
2M    36.3    38.7     7%
4M   322.6   356.7    11%

Signed-off-by: Mark Hairgrove
---
 arch/powerpc/platforms/powernv/npu-dma.c | 99 ++---
 1 files changed, 48 insertions(+), 51 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/npu

[PATCH 0/3] powerpc/powernv/npu: Improve ATSD invalidation overhead

2018-09-27 Thread Mark Hairgrove
overhead by rearranging how the ATSDs are issued and by using size-based
ATSD invalidates.

Mark Hairgrove (3):
  powerpc/powernv/npu: Reduce eieio usage when issuing ATSD invalidates
  powerpc/powernv/npu: Use size-based ATSD invalidates
  powerpc/powernv/npu: Remove atsd_threshold debugfs setting

Re: [PATCH 2/3] powerpc/powernv/npu: Use size-based ATSD invalidates

2018-10-02 Thread Mark Hairgrove
Thanks for the review. Comments below.

On Tue, 2 Oct 2018, Alistair Popple wrote:

> Thanks Mark,
>
> Looks like some worthwhile improvements to be had. I've added a couple of
> comments inline below.
>
> > +#define PAGE_64K (64UL * 1024)
> > +#define PAGE_2M (2UL * 1024 * 1024)
> > +#define
> > P

Re: [PATCH 2/3] powerpc/powernv/npu: Use size-based ATSD invalidates

2018-10-03 Thread Mark Hairgrove
On Wed, 3 Oct 2018, Alistair Popple wrote:

> > >
> > > We also support 4K page sizes on PPC. If I am not mistaken this means
> > > every ATSD would invalidate the entire GPU TLB for the given PID on
> > > those systems. Could we change the above check to
> > > `if (size <= PAGE_64K)`
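
The exchange above is about how the invalidate size is chosen from the size of the range. A minimal sketch of a threshold-based choice follows, assuming only what is quoted in this thread: the `PAGE_64K` and `PAGE_2M` definitions and the suggested `size <= PAGE_64K` check. The enum, the function name `pick_invalidate`, and the fall-back to a full per-PID invalidate are hypothetical illustrations and are not taken from the patch.

```c
#include <stdint.h>

#define PAGE_64K (64UL * 1024)
#define PAGE_2M  (2UL * 1024 * 1024)

/* Hypothetical granules, for illustration only. */
enum inval_kind {
	INVAL_64K,	/* invalidate a single 64K page */
	INVAL_2M,	/* invalidate a single 2M page */
	INVAL_ALL,	/* invalidate everything for the PID */
};

/*
 * Pick the smallest granule that is at least as large as the range.
 * With a "<=" comparison, a 4K range maps to a 64K invalidate instead
 * of falling through to INVAL_ALL.
 * Alignment of the range is deliberately ignored in this sketch.
 */
static enum inval_kind pick_invalidate(uint64_t size)
{
	if (size <= PAGE_64K)
		return INVAL_64K;
	if (size <= PAGE_2M)
		return INVAL_2M;
	return INVAL_ALL;
}
```

Under these assumptions, `pick_invalidate(4096)` selects the 64K granule, which is the behaviour the reviewer asks for on systems with 4K pages.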

[PATCH v2 0/3] powerpc/powernv/npu: Improve ATSD invalidation overhead

2018-10-03 Thread Mark Hairgrove
overhead by rearranging how the ATSDs are issued and by using size-based
ATSD invalidates.

Mark Hairgrove (3):
  powerpc/powernv/npu: Reduce eieio usage when issuing ATSD invalidates
  powerpc/powernv/npu: Use size-based ATSD invalidates
  powerpc/powernv/npu: Remove atsd_threshold debugfs setting

[PATCH v2 2/3] powerpc/powernv/npu: Use size-based ATSD invalidates

2018-10-03 Thread Mark Hairgrove
354.6 -1%

Anything over 2M is roughly the same as before since both cases issue a
single ATSD.

Signed-off-by: Mark Hairgrove
---
 arch/powerpc/platforms/powernv/npu-dma.c | 103 --
 1 files changed, 55 insertions(+), 48 deletions(-)

diff --git a/arch/powerpc/plat

[PATCH v2 1/3] powerpc/powernv/npu: Reduce eieio usage when issuing ATSD invalidates

2018-10-03 Thread Mark Hairgrove
7%
2M    36.3    38.7     7%
4M   322.6   356.7    11%

Signed-off-by: Mark Hairgrove
---
 arch/powerpc/platforms/powernv/npu-dma.c | 99 ++---
 1 files changed, 48 insertions(+), 51 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/npu

[PATCH v2 3/3] powerpc/powernv/npu: Remove atsd_threshold debugfs setting

2018-10-03 Thread Mark Hairgrove
This threshold is no longer used now that all invalidates issue a single
ATSD to each active NPU.

Signed-off-by: Mark Hairgrove
---
 arch/powerpc/platforms/powernv/npu-dma.c | 14 --
 1 files changed, 0 insertions(+), 14 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/npu

[PATCH] powerpc/npu-dma.c: Fix crash after __mmu_notifier_register failure

2018-02-09 Thread Mark Hairgrove
. This patch calls opal_npu_destroy_context on the failure paths, and
makes sure not to assign mm->context.npu_context until past the failure
points.

Signed-off-by: Mark Hairgrove
---
 arch/powerpc/platforms/powernv/npu-dma.c | 32 +++--
 1 files changed, 21 insertions(
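
The fix described above follows a common pattern: undo earlier setup on every failure path, and publish the new pointer only once nothing can fail any more. The sketch below shows that pattern in plain userspace C. Every name in it (`struct ctx`, `struct owner`, `hw_create`, `hw_destroy`, `ctx_init`, the `register_fails` flag) is a hypothetical stand-in, not a kernel or OPAL interface, and it is not the code from the patch.

```c
#include <stdlib.h>

/* All names here are hypothetical stand-ins, not kernel APIs. */
struct ctx   { int hw_ready; };
struct owner { struct ctx *ctx; };	/* plays the role of mm->context */

static int  hw_create(struct ctx *c)  { c->hw_ready = 1; return 0; }
static void hw_destroy(struct ctx *c) { c->hw_ready = 0; }

/*
 * Returns 0 on success, -1 on failure.
 * register_fails simulates a registration step that can fail
 * after the hardware-side context has already been created.
 */
static int ctx_init(struct owner *o, int register_fails)
{
	struct ctx *c = calloc(1, sizeof(*c));

	if (!c)
		return -1;
	if (hw_create(c)) {
		free(c);
		return -1;
	}
	if (register_fails) {
		hw_destroy(c);	/* undo the hardware-side setup */
		free(c);
		return -1;	/* o->ctx was never assigned */
	}
	o->ctx = c;		/* publish only past the last failure point */
	return 0;
}
```

Because the assignment to `o->ctx` comes last, a failed call leaves the owner without a dangling pointer to freed memory.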

Re: [PATCH] powerpc/npu-dma.c: Fix deadlock in mmio_invalidate

2018-02-15 Thread Mark Hairgrove
On Wed, 14 Feb 2018, Alistair Popple wrote:

> > > +struct mmio_atsd_reg {
> > > +    struct npu *npu;
> > > +    int reg;
> > > +};
> > > +
> >
> > Is it just easier to move reg to inside of struct npu?
>
> I don't think so, struct npu is global to all npu contexts whereas this is
> specific to the

Re: [PATCH] powerpc/npu-dma.c: Fix deadlock in mmio_invalidate

2018-02-20 Thread Mark Hairgrove
On Mon, 19 Feb 2018, Balbir Singh wrote:

> Good point, although I think the acquire_* function itself may be called
> from a higher layer with the mmap_sem always held. I wonder if we need
> barriers around get and put mmio_atsd_reg.

I agree with the need for memory barriers. FWIW, page tables

Re: [PATCH 1/2] powernv/npu: Add lock to prevent race in concurrent context init/destroy

2018-04-12 Thread Mark Hairgrove
ref_put(&npu_context->kref, pnv_npu2_release_context);
> +    spin_unlock(&npu_context_lock);
> +
> +    /*
> +     * We need to do this outside of pnv_npu2_release_context so that it is
> +     * outside the spinlock as mmu_notifier_destroy uses SRCU.
> +     */
> +    if (removed) {
> +        mmu_notifier_unregister(&npu_context->mn,
> +                                npu_context->mm);
> +
> +        kfree(npu_context);
> +    }
> +
>  }
>  EXPORT_SYMBOL(pnv_npu2_destroy_context);
>
> --
> 2.11.0
>

Reviewed-by: Mark Hairgrove
Tested-by: Mark Hairgrove
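
The quoted hunk records the outcome of the reference drop while the lock is held and then does the teardown, which may sleep, after the lock is released. The sketch below models that ordering in userspace C. It is a toy under stated assumptions: a boolean flag stands in for the spinlock, and `toy_lock`, `toy_unlock`, `struct obj`, `sleeping_teardown`, and `obj_put` are all hypothetical names, not kernel interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; a flag models the spinlock. */
static bool lock_held;
static void toy_lock(void)   { lock_held = true; }
static void toy_unlock(void) { lock_held = false; }

struct obj { int refs; };

/* Models a teardown step that may sleep, so it must not run under the lock. */
static void sleeping_teardown(struct obj *o)
{
	assert(!lock_held);
	free(o);
}

/* Drop one reference; returns true if the object was torn down. */
static bool obj_put(struct obj *o)
{
	bool removed;

	toy_lock();
	removed = (--o->refs == 0);	/* only record the outcome under the lock */
	toy_unlock();

	if (removed)
		sleeping_teardown(o);	/* blocking work happens after unlock */
	return removed;
}
```

The assertion inside `sleeping_teardown` makes the ordering requirement explicit: if the teardown were moved inside the locked region, the sketch would abort.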

Re: [PATCH 2/2] powernv/npu: Prevent overwriting of pnv_npu2_init_contex() callback parameters

2018-04-12 Thread Mark Hairgrove
,
> +                gpdev->devfn));
> +            return ERR_PTR(-EINVAL);
> +        }
> +
>          WARN_ON(!kref_get_unless_zero(&npu_context->kref));
> +    }
>      spin_unlock(&npu_context_lock);
>
>      if (!npu_context) {
> --
> 2.11.0
>

Reviewed-by: Mark Hairgrove
Tested-by: Mark Hairgrove