[PATCH] drm/amdgpu: fix S3 failure on specific platform

2017-07-03 Thread Ken Wang
Change-Id: Ie932508ad6949f8bfc7c8db1f5874d3440d09fc6
Signed-off-by: Ken Wang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h        | 3 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 ++++++++
 2 files changed, 11 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index ecc33c4..54c30fe 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1702,6 +1702,9 @@ struct amdgpu_device {
/* record hw reset is performed */
bool has_hw_reset;
 
+	/* record last mm index being written through WREG32 */
+   unsigned long last_mm_index;
+
 };
 
 static inline struct amdgpu_device *amdgpu_ttm_adev(struct ttm_bo_device *bdev)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 21e504a..24b908c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -124,6 +124,10 @@ void amdgpu_mm_wreg(struct amdgpu_device *adev, uint32_t reg, uint32_t v,
 {
trace_amdgpu_mm_wreg(adev->pdev->device, reg, v);
 
+   if (adev->asic_type >= CHIP_VEGA10 && reg == 0) {
+   adev->last_mm_index = v;
+   }
+
if (!(acc_flags & AMDGPU_REGS_NO_KIQ) && amdgpu_sriov_runtime(adev)) {
BUG_ON(in_interrupt());
return amdgpu_virt_kiq_wreg(adev, reg, v);
@@ -139,6 +143,10 @@ void amdgpu_mm_wreg(struct amdgpu_device *adev, uint32_t reg, uint32_t v,
 		writel(v, ((void __iomem *)adev->rmmio) + (mmMM_DATA * 4));
 		spin_unlock_irqrestore(&adev->mmio_idx_lock, flags);
 	}
+
+	if (adev->asic_type >= CHIP_VEGA10 && reg == 1 && adev->last_mm_index == 0x5702C) {
+		udelay(500);
+	}
 }
 
 u32 amdgpu_io_rreg(struct amdgpu_device *adev, u32 reg)
-- 
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: fix S3 failure on specific platform

2017-07-03 Thread Wang, Ken
sure, I will add the delay to io_wreg.


From: Deucher, Alexander
Sent: Tuesday, July 4, 2017 12:47:38 PM
To: Wang, Ken; amd-gfx@lists.freedesktop.org
Cc: Wang, Ken
Subject: RE: [PATCH] drm/amdgpu: fix S3 failure on specific platform

> -----Original Message-----
> From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf
> Of Ken Wang
> Sent: Tuesday, July 04, 2017 12:26 AM
> To: amd-gfx@lists.freedesktop.org
> Cc: Wang, Ken
> Subject: [PATCH] drm/amdgpu: fix S3 failure on specific platform
>
> Change-Id: Ie932508ad6949f8bfc7c8db1f5874d3440d09fc6
> Signed-off-by: Ken Wang 

Do we need to add this to the io_rreg and io_wreg functions as well?  Atom uses 
them for IIO tables.  Might also want to give a brief description of the 
problem.  Something like:
Certain MC registers need a delay after writing them to properly update in the 
init sequence.
Reviewed-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu.h        | 3 +++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 ++++++++
>  2 files changed, 11 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index ecc33c4..54c30fe 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -1702,6 +1702,9 @@ struct amdgpu_device {
>/* record hw reset is performed */
>bool has_hw_reset;
>
> + /* record last mm index being written through WREG32*/
> + unsigned long last_mm_index;
> +
>  };
>
>  static inline struct amdgpu_device *amdgpu_ttm_adev(struct
> ttm_bo_device *bdev)
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 21e504a..24b908c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -124,6 +124,10 @@ void amdgpu_mm_wreg(struct amdgpu_device
> *adev, uint32_t reg, uint32_t v,
>  {
>trace_amdgpu_mm_wreg(adev->pdev->device, reg, v);
>
> + if (adev->asic_type >= CHIP_VEGA10 && reg == 0) {
> + adev->last_mm_index = v;
> + }
> +
>if (!(acc_flags & AMDGPU_REGS_NO_KIQ) &&
> amdgpu_sriov_runtime(adev)) {
>BUG_ON(in_interrupt());
>return amdgpu_virt_kiq_wreg(adev, reg, v);
> @@ -139,6 +143,10 @@ void amdgpu_mm_wreg(struct amdgpu_device
> *adev, uint32_t reg, uint32_t v,
>writel(v, ((void __iomem *)adev->rmmio) + (mmMM_DATA
> * 4));
> 		spin_unlock_irqrestore(&adev->mmio_idx_lock, flags);
>}
> +
> +	if (adev->asic_type >= CHIP_VEGA10 && reg == 1 && adev->last_mm_index == 0x5702C) {
> +		udelay(500);
> +	}
>  }
>
>  u32 amdgpu_io_rreg(struct amdgpu_device *adev, u32 reg)
> --
> 2.7.4
>


Re: [PATCH 07/12] drm: export drm_gem_prime_dmabuf_ops

2017-07-03 Thread Dave Airlie
On 4 Jul. 2017 11:23, "Michel Dänzer"  wrote:


Adding the dri-devel list, since this is a core DRM patch.


On 04/07/17 06:11 AM, Felix Kuehling wrote:
> From: Christian König 
>
> This allows drivers to check if a DMA-buf contains a GEM object or not.


Please use an accessor function. I doubt it'll be a fast path.

Dave.

>
> Signed-off-by: Christian König 
> Reviewed-by: Felix Kuehling 
> ---
>  drivers/gpu/drm/drm_prime.c | 3 ++-
>  include/drm/drmP.h  | 2 ++
>  2 files changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
> index 25aa455..5cb4fd9 100644
> --- a/drivers/gpu/drm/drm_prime.c
> +++ b/drivers/gpu/drm/drm_prime.c
> @@ -396,7 +396,7 @@ static int drm_gem_dmabuf_mmap(struct dma_buf *dma_buf,
>   return dev->driver->gem_prime_mmap(obj, vma);
>  }
>
> -static const struct dma_buf_ops drm_gem_prime_dmabuf_ops =  {
> +const struct dma_buf_ops drm_gem_prime_dmabuf_ops =  {
>   .attach = drm_gem_map_attach,
>   .detach = drm_gem_map_detach,
>   .map_dma_buf = drm_gem_map_dma_buf,
> @@ -410,6 +410,7 @@ static int drm_gem_dmabuf_mmap(struct dma_buf *dma_buf,
>   .vmap = drm_gem_dmabuf_vmap,
>   .vunmap = drm_gem_dmabuf_vunmap,
>  };
> +EXPORT_SYMBOL(drm_gem_prime_dmabuf_ops);
>
>  /**
>   * DOC: PRIME Helpers
> diff --git a/include/drm/drmP.h b/include/drm/drmP.h
> index 6105c05..e0ea8f8 100644
> --- a/include/drm/drmP.h
> +++ b/include/drm/drmP.h
> @@ -761,6 +761,8 @@ static inline int drm_debugfs_remove_files(const struct drm_info_list *files,
>
>  struct dma_buf_export_info;
>
> +extern const struct dma_buf_ops drm_gem_prime_dmabuf_ops;
> +
>  extern struct dma_buf *drm_gem_prime_export(struct drm_device *dev,
>   struct drm_gem_object *obj,
>   int flags);
>


--
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer




RE: [PATCH] drm/amdgpu: fix S3 failure on specific platform

2017-07-03 Thread Deucher, Alexander
> -----Original Message-----
> From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf
> Of Ken Wang
> Sent: Tuesday, July 04, 2017 12:26 AM
> To: amd-gfx@lists.freedesktop.org
> Cc: Wang, Ken
> Subject: [PATCH] drm/amdgpu: fix S3 failure on specific platform
> 
> Change-Id: Ie932508ad6949f8bfc7c8db1f5874d3440d09fc6
> Signed-off-by: Ken Wang 

Do we need to add this to the io_rreg and io_wreg functions as well?  Atom uses 
them for IIO tables.  Might also want to give a brief description of the 
problem.  Something like:
Certain MC registers need a delay after writing them to properly update in the 
init sequence.
Reviewed-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu.h        | 3 +++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 ++++++++
>  2 files changed, 11 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index ecc33c4..54c30fe 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -1702,6 +1702,9 @@ struct amdgpu_device {
>   /* record hw reset is performed */
>   bool has_hw_reset;
> 
> + /* record last mm index being written through WREG32*/
> + unsigned long last_mm_index;
> +
>  };
> 
>  static inline struct amdgpu_device *amdgpu_ttm_adev(struct
> ttm_bo_device *bdev)
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 21e504a..24b908c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -124,6 +124,10 @@ void amdgpu_mm_wreg(struct amdgpu_device
> *adev, uint32_t reg, uint32_t v,
>  {
>   trace_amdgpu_mm_wreg(adev->pdev->device, reg, v);
> 
> + if (adev->asic_type >= CHIP_VEGA10 && reg == 0) {
> + adev->last_mm_index = v;
> + }
> +
>   if (!(acc_flags & AMDGPU_REGS_NO_KIQ) &&
> amdgpu_sriov_runtime(adev)) {
>   BUG_ON(in_interrupt());
>   return amdgpu_virt_kiq_wreg(adev, reg, v);
> @@ -139,6 +143,10 @@ void amdgpu_mm_wreg(struct amdgpu_device
> *adev, uint32_t reg, uint32_t v,
>   writel(v, ((void __iomem *)adev->rmmio) + (mmMM_DATA
> * 4));
> 		spin_unlock_irqrestore(&adev->mmio_idx_lock, flags);
>   }
> +
> +	if (adev->asic_type >= CHIP_VEGA10 && reg == 1 && adev->last_mm_index == 0x5702C) {
> +		udelay(500);
> +	}
>  }
> 
>  u32 amdgpu_io_rreg(struct amdgpu_device *adev, u32 reg)
> --
> 2.7.4
> 




RE: [PATCH 07/11] drm/amdgpu: rename GART to SYSVM

2017-07-03 Thread Zhou, David(ChunMing)
Distinguishing the system VM from general VMs is a good idea, but I'm not sure about the part that renames GTT to SYSVM, especially since TTM TT stays as it is. Maybe we just need to rename the GART functions to SYSVM.

Regards,
David Zhou

-----Original Message-----
From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of 
Christian K?nig
Sent: Monday, July 03, 2017 5:45 PM
To: amd-gfx@lists.freedesktop.org
Subject: [PATCH 07/11] drm/amdgpu: rename GART to SYSVM

From: Christian König 

Just mass rename all names related to the hardware GART/GTT functions to SYSVM.

The name of symbols related to the TTM TT domain stay the same.

This should improve the distinction between the two.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/Kconfig |   9 +-
 drivers/gpu/drm/amd/amdgpu/Makefile|   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu.h|  58 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  48 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c   | 423 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c|   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_sysvm.c  | 423 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_test.c   |  84 +++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c|  76 +++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h|   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c |  30 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  |   4 +-
 drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c   |  16 +-
 drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.h   |   4 +-
 drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c  |  66 ++---
 drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c  |  70 ++---
 drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c  |  70 ++---
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  |  66 ++---
 drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c|  16 +-
 drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.h|   4 +-
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c |   4 +-
 drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c  |   8 +-
 drivers/gpu/drm/amd/amdgpu/vce_v4_0.c  |   4 +-
 drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c  |   8 +-
 24 files changed, 749 insertions(+), 748 deletions(-)
 delete mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_sysvm.c

diff --git a/drivers/gpu/drm/amd/amdgpu/Kconfig b/drivers/gpu/drm/amd/amdgpu/Kconfig
index e8af1f5..ebbac01 100644
--- a/drivers/gpu/drm/amd/amdgpu/Kconfig
+++ b/drivers/gpu/drm/amd/amdgpu/Kconfig
@@ -31,14 +31,15 @@ config DRM_AMDGPU_USERPTR
  This option selects CONFIG_MMU_NOTIFIER if it isn't already
  selected to enabled full userptr support.
 
-config DRM_AMDGPU_GART_DEBUGFS
-   bool "Allow GART access through debugfs"
+config DRM_AMDGPU_SYSVM_DEBUGFS
+   bool "Allow SYSVM access through debugfs"
depends on DRM_AMDGPU
depends on DEBUG_FS
default n
help
- Selecting this option creates a debugfs file to inspect the mapped
- pages. Uses more memory for housekeeping, enable only for debugging.
+ Selecting this option creates a debugfs file to inspect the SYSVM
+ mapped pages. Uses more memory for housekeeping, enable only for
+ debugging.
 
 source "drivers/gpu/drm/amd/acp/Kconfig"
 source "drivers/gpu/drm/amd/display/Kconfig"
diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile b/drivers/gpu/drm/amd/amdgpu/Makefile
index 3661110..d80d49f 100644
--- a/drivers/gpu/drm/amd/amdgpu/Makefile
+++ b/drivers/gpu/drm/amd/amdgpu/Makefile
@@ -22,7 +22,7 @@ amdgpu-y := amdgpu_drv.o
 # add KMS driver
 amdgpu-y += amdgpu_device.o amdgpu_kms.o \
amdgpu_atombios.o atombios_crtc.o amdgpu_connectors.o \
-   atom.o amdgpu_fence.o amdgpu_ttm.o amdgpu_object.o amdgpu_gart.o \
+   atom.o amdgpu_fence.o amdgpu_ttm.o amdgpu_object.o amdgpu_sysvm.o \
amdgpu_encoders.o amdgpu_display.o amdgpu_i2c.o \
amdgpu_fb.o amdgpu_gem.o amdgpu_ring.o \
amdgpu_cs.o amdgpu_bios.o amdgpu_benchmark.o amdgpu_test.o \
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 4a2b33d..abe191f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -305,7 +305,7 @@ struct amdgpu_vm_pte_funcs {
 };
 
 /* provided by the gmc block */
-struct amdgpu_gart_funcs {
+struct amdgpu_sysvm_funcs {
/* flush the vm tlb via mmio */
void (*flush_gpu_tlb)(struct amdgpu_device *adev,
  uint32_t vmid);
@@ -543,39 +543,39 @@ struct amdgpu_mc;
 #define AMDGPU_GPU_PAGE_SHIFT 12
 #define AMDGPU_GPU_PAGE_ALIGN(a) (((a) + AMDGPU_GPU_PAGE_MASK) & ~AMDGPU_GPU_PAGE_MASK)
 
-struct amdgpu_gart {
+struct amdgpu_sysvm {
	dma_addr_t			table_addr;
	struct amdgpu_bo		*robj;
	void				*ptr;
	unsigned			num_gpu_pages;
	unsigned			num_cpu_pages;

RE: [PATCH 03/12] drm/amdgpu: Enable SDMA context switching for CIK

2017-07-03 Thread Zhou, David(ChunMing)
Reviewed-by: Chunming Zhou 

-----Original Message-----
From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of Felix 
Kuehling
Sent: Tuesday, July 04, 2017 5:11 AM
To: amd-gfx@lists.freedesktop.org
Cc: Kuehling, Felix 
Subject: [PATCH 03/12] drm/amdgpu: Enable SDMA context switching for CIK

Enable SDMA context switching on CIK (copied from sdma_v3_0.c).

Signed-off-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdgpu/cik_sdma.c | 30 ++
 1 file changed, 30 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
index c216e16..4a9cea0 100644
--- a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
+++ b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
@@ -342,6 +342,33 @@ static void cik_sdma_rlc_stop(struct amdgpu_device *adev)
 }
 
 /**
+ * cik_ctx_switch_enable - enable/disable the async dma engines context switch
+ *
+ * @adev: amdgpu_device pointer
+ * @enable: enable/disable the DMA MEs context switch.
+ *
+ * Halt or unhalt the async dma engines context switch (CIK).
+ */
+static void cik_ctx_switch_enable(struct amdgpu_device *adev, bool enable)
+{
+	u32 f32_cntl;
+   int i;
+
+   for (i = 0; i < adev->sdma.num_instances; i++) {
+   f32_cntl = RREG32(mmSDMA0_CNTL + sdma_offsets[i]);
+   if (enable) {
+   f32_cntl = REG_SET_FIELD(f32_cntl, SDMA0_CNTL,
+   AUTO_CTXSW_ENABLE, 1);
+   } else {
+   f32_cntl = REG_SET_FIELD(f32_cntl, SDMA0_CNTL,
+   AUTO_CTXSW_ENABLE, 0);
+   }
+
+   WREG32(mmSDMA0_CNTL + sdma_offsets[i], f32_cntl);
+   }
+}
+
+/**
  * cik_sdma_enable - stop the async dma engines
  *
  * @adev: amdgpu_device pointer
@@ -537,6 +564,8 @@ static int cik_sdma_start(struct amdgpu_device *adev)
 
/* halt the engine before programing */
cik_sdma_enable(adev, false);
+   /* enable sdma ring preemption */
+   cik_ctx_switch_enable(adev, true);
 
/* start the gfx rings and rlc compute queues */
r = cik_sdma_gfx_resume(adev);
@@ -984,6 +1013,7 @@ static int cik_sdma_hw_fini(void *handle)
 {
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
+   cik_ctx_switch_enable(adev, false);
cik_sdma_enable(adev, false);
 
return 0;
--
1.9.1



RE: [PATCH 04/12] drm/amdgpu: Make SDMA phase quantum configurable

2017-07-03 Thread Zhou, David(ChunMing)
Acked-by: Chunming Zhou 

-----Original Message-----
From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of Felix 
Kuehling
Sent: Tuesday, July 04, 2017 5:11 AM
To: amd-gfx@lists.freedesktop.org
Cc: Kuehling, Felix 
Subject: [PATCH 04/12] drm/amdgpu: Make SDMA phase quantum configurable

Set a configurable SDMA phase quantum when enabling SDMA context switching. The 
default value significantly reduces SDMA latency in page table updates when 
user-mode SDMA queues have concurrent activity, compared to the initial HW 
setting.

Signed-off-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c |  4 ++++
 drivers/gpu/drm/amd/amdgpu/cik_sdma.c   | 32 ++-
 drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c  | 32 ++-
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c  | 34 +-
 5 files changed, 100 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 810796a..2129fbb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -106,6 +106,7 @@
 extern unsigned amdgpu_pcie_lane_cap;
 extern unsigned amdgpu_cg_mask;
 extern unsigned amdgpu_pg_mask;
+extern unsigned amdgpu_sdma_phase_quantum;
 extern char *amdgpu_disable_cu;
 extern char *amdgpu_virtual_display;
 extern unsigned amdgpu_pp_feature_mask;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 4bf4a80..02cf24e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -107,6 +107,7 @@
 unsigned amdgpu_pcie_lane_cap = 0;
 unsigned amdgpu_cg_mask = 0xffffffff;
 unsigned amdgpu_pg_mask = 0xffffffff;
+unsigned amdgpu_sdma_phase_quantum = 32;
 char *amdgpu_disable_cu = NULL;
 char *amdgpu_virtual_display = NULL;
 unsigned amdgpu_pp_feature_mask = 0xffffffff;
@@ -223,6 +224,9 @@
 MODULE_PARM_DESC(pg_mask, "Powergating flags mask (0 = disable power gating)");
 module_param_named(pg_mask, amdgpu_pg_mask, uint, 0444);
 
+MODULE_PARM_DESC(sdma_phase_quantum, "SDMA context switch phase quantum (x 1K GPU clock cycles, 0 = no change (default 32))");
+module_param_named(sdma_phase_quantum, amdgpu_sdma_phase_quantum, uint, 0444);
+
 MODULE_PARM_DESC(disable_cu, "Disable CUs (se.sh.cu,...)");
 module_param_named(disable_cu, amdgpu_disable_cu, charp, 0444);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
index 4a9cea0..f508f4d 100644
--- a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
+++ b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
@@ -351,14 +351,44 @@ static void cik_sdma_rlc_stop(struct amdgpu_device *adev)
  */
 static void cik_ctx_switch_enable(struct amdgpu_device *adev, bool enable)
 {
-   u32 f32_cntl;
+   u32 f32_cntl, phase_quantum = 0;
int i;
 
+   if (amdgpu_sdma_phase_quantum) {
+   unsigned value = amdgpu_sdma_phase_quantum;
+   unsigned unit = 0;
+
+   while (value > (SDMA0_PHASE0_QUANTUM__VALUE_MASK >>
+   SDMA0_PHASE0_QUANTUM__VALUE__SHIFT)) {
+   value = (value + 1) >> 1;
+   unit++;
+   }
+   if (unit > (SDMA0_PHASE0_QUANTUM__UNIT_MASK >>
+   SDMA0_PHASE0_QUANTUM__UNIT__SHIFT)) {
+   value = (SDMA0_PHASE0_QUANTUM__VALUE_MASK >>
+SDMA0_PHASE0_QUANTUM__VALUE__SHIFT);
+   unit = (SDMA0_PHASE0_QUANTUM__UNIT_MASK >>
+   SDMA0_PHASE0_QUANTUM__UNIT__SHIFT);
+   WARN_ONCE(1,
+   "clamping sdma_phase_quantum to %uK clock cycles\n",
+ value << unit);
+   }
+   phase_quantum =
+   value << SDMA0_PHASE0_QUANTUM__VALUE__SHIFT |
+   unit  << SDMA0_PHASE0_QUANTUM__UNIT__SHIFT;
+   }
+
for (i = 0; i < adev->sdma.num_instances; i++) {
f32_cntl = RREG32(mmSDMA0_CNTL + sdma_offsets[i]);
if (enable) {
f32_cntl = REG_SET_FIELD(f32_cntl, SDMA0_CNTL,
AUTO_CTXSW_ENABLE, 1);
+   if (amdgpu_sdma_phase_quantum) {
+   WREG32(mmSDMA0_PHASE0_QUANTUM + sdma_offsets[i],
+  phase_quantum);
+   WREG32(mmSDMA0_PHASE1_QUANTUM + sdma_offsets[i],
+  phase_quantum);
+   }
} else {
f32_cntl = REG_SET_FIELD(f32_cntl, SDMA0_CNTL,
AUTO_CTXSW_ENABLE, 0);
diff --git 

Re: [PATCH 08/12] drm/amdgpu: disallow foreign BOs for UVD/VCE

2017-07-03 Thread Alex Deucher
On Mon, Jul 3, 2017 at 5:11 PM, Felix Kuehling  wrote:
> From: Christian König 
>
> They don't support VM mode yet.
>
> Signed-off-by: Christian König 
> Reviewed-by: Felix Kuehling 

This could probably be refined since newer asics support VM for MM
engines.  Maybe add a comment to that effect?  I would add a comment
in general either way.

Alex

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index 82131d7..24035e4 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -1343,7 +1343,8 @@ struct amdgpu_bo_va_mapping *
> struct amdgpu_bo_list_entry *lobj;
>
> 		lobj = &parser->bo_list->array[i];
> -   if (!lobj->bo_va)
> +   if (!lobj->bo_va ||
> +		    amdgpu_ttm_adev(lobj->bo_va->bo->tbo.bdev) != parser->adev)
> continue;
>
> 		list_for_each_entry(mapping, &lobj->bo_va->valids, list) {
> --
> 1.9.1
>


Re: [PATCH 07/12] drm: export drm_gem_prime_dmabuf_ops

2017-07-03 Thread Michel Dänzer

Adding the dri-devel list, since this is a core DRM patch.


On 04/07/17 06:11 AM, Felix Kuehling wrote:
> From: Christian König 
> 
> This allows drivers to check if a DMA-buf contains a GEM object or not.
> 
> Signed-off-by: Christian König 
> Reviewed-by: Felix Kuehling 
> ---
>  drivers/gpu/drm/drm_prime.c | 3 ++-
>  include/drm/drmP.h  | 2 ++
>  2 files changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
> index 25aa455..5cb4fd9 100644
> --- a/drivers/gpu/drm/drm_prime.c
> +++ b/drivers/gpu/drm/drm_prime.c
> @@ -396,7 +396,7 @@ static int drm_gem_dmabuf_mmap(struct dma_buf *dma_buf,
>   return dev->driver->gem_prime_mmap(obj, vma);
>  }
>  
> -static const struct dma_buf_ops drm_gem_prime_dmabuf_ops =  {
> +const struct dma_buf_ops drm_gem_prime_dmabuf_ops =  {
>   .attach = drm_gem_map_attach,
>   .detach = drm_gem_map_detach,
>   .map_dma_buf = drm_gem_map_dma_buf,
> @@ -410,6 +410,7 @@ static int drm_gem_dmabuf_mmap(struct dma_buf *dma_buf,
>   .vmap = drm_gem_dmabuf_vmap,
>   .vunmap = drm_gem_dmabuf_vunmap,
>  };
> +EXPORT_SYMBOL(drm_gem_prime_dmabuf_ops);
>  
>  /**
>   * DOC: PRIME Helpers
> diff --git a/include/drm/drmP.h b/include/drm/drmP.h
> index 6105c05..e0ea8f8 100644
> --- a/include/drm/drmP.h
> +++ b/include/drm/drmP.h
> @@ -761,6 +761,8 @@ static inline int drm_debugfs_remove_files(const struct drm_info_list *files,
>  
>  struct dma_buf_export_info;
>  
> +extern const struct dma_buf_ops drm_gem_prime_dmabuf_ops;
> +
>  extern struct dma_buf *drm_gem_prime_export(struct drm_device *dev,
>   struct drm_gem_object *obj,
>   int flags);
> 


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


Re: [PATCH RFC 2/2] drm/amdgpu: Set/clear CPU_ACCESS flag on page fault and move to VRAM

2017-07-03 Thread Michel Dänzer
On 03/07/17 08:47 PM, Christian König wrote:
> On 03.07.2017 at 11:49, Michel Dänzer wrote:
>>> Instead of messing with all this I suggest that we just add a jiffies
>>> based timeout to the BO when we can clear the flag. For kernel BOs this
>>> timeout is just infinity.
>>>
>>> Then we check in amdgpu_cs_bo_validate() before generating the
>>> placements if we could clear the flag and do so based on the timeout.
>> The idea for this patch was to save the memory and CPU cycles needed for
>> that approach.
> But when we clear the flag on the end of the move we already moved the
> BO to visible VRAM again.

Right. Clearing the flag before the move would make the flag
ineffective. We want to put the BO in CPU visible VRAM when the flag is
set, because it means the BO has been accessed by the CPU since it was
created or since it was last moved to VRAM.


> Only on after the next swapout/swapin cycle we see an effect of that
> change.

Right, clearing the flag cannot have any effect before the next time the
BO is moved to VRAM anyway.


> Is that the intended approach?

So yes, it is.


The only significant difference to the timestamp based approach (for
which John already posted patches before) is that this patch will
remember any CPU access until the next time the BO is moved to VRAM, no
matter how much time passes in between.

BTW, one issue with the timestamp based approach is that we only get a
page fault on the first CPU access after the BO was moved. So the
timestamp only says how much time has passed since the first CPU access,
not since the last CPU access.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


Re: [PATCH 06/12] drm/amdgpu: Correctly establish the suspend/resume hook for amdkfd

2017-07-03 Thread Oded Gabbay
On Tue, Jul 4, 2017 at 12:11 AM, Felix Kuehling  wrote:
> From: Yong Zhao 
>
> Signed-off-by: Yong Zhao 
> Reviewed-by: Felix Kuehling 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 7 +++
>  drivers/gpu/drm/amd/amdgpu/cik.c   | 9 +
>  2 files changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 5b1220f..bc69b9c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -56,6 +56,8 @@
>  #include 
>  #include "amdgpu_vf_error.h"
>
> +#include "amdgpu_amdkfd.h"
> +
>  MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
>  MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
>
> @@ -2397,6 +2399,8 @@ int amdgpu_device_suspend(struct drm_device *dev, bool suspend, bool fbcon)
> drm_modeset_unlock_all(dev);
> }
>
> +   amdgpu_amdkfd_suspend(adev);
> +
> /* unpin the front buffers and cursors */
> 	list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
> struct amdgpu_crtc *amdgpu_crtc = to_amdgpu_crtc(crtc);
> @@ -2537,6 +2541,9 @@ int amdgpu_device_resume(struct drm_device *dev, bool resume, bool fbcon)
> }
> }
> }
> +   r = amdgpu_amdkfd_resume(adev);
> +   if (r)
> +   return r;
>
> /* blat the mode back in */
> if (fbcon) {
> diff --git a/drivers/gpu/drm/amd/amdgpu/cik.c b/drivers/gpu/drm/amd/amdgpu/cik.c
> index 6ce9f80..00639bf 100644
> --- a/drivers/gpu/drm/amd/amdgpu/cik.c
> +++ b/drivers/gpu/drm/amd/amdgpu/cik.c
> @@ -1825,21 +1825,14 @@ static int cik_common_suspend(void *handle)
>  {
> struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>
> -   amdgpu_amdkfd_suspend(adev);
> -
> return cik_common_hw_fini(adev);
>  }
>
>  static int cik_common_resume(void *handle)
>  {
> -   int r;
> struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>
> -   r = cik_common_hw_init(adev);
> -   if (r)
> -   return r;
> -
> -   return amdgpu_amdkfd_resume(adev);
> +   return cik_common_hw_init(adev);
>  }
>
>  static bool cik_common_is_idle(void *handle)
> --
> 1.9.1
>
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

This patch is:
Reviewed-by: Oded Gabbay 
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH 08/12] drm/amdgpu: disallow foreign BOs for UVD/VCE

2017-07-03 Thread Felix Kuehling
From: Christian König 

They don't support VM mode yet.

Signed-off-by: Christian König 
Reviewed-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 82131d7..24035e4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1343,7 +1343,8 @@ struct amdgpu_bo_va_mapping *
struct amdgpu_bo_list_entry *lobj;
 
lobj = &parser->bo_list->array[i];
-   if (!lobj->bo_va)
+   if (!lobj->bo_va ||
+   amdgpu_ttm_adev(lobj->bo_va->bo->tbo.bdev) != parser->adev)
continue;
 
list_for_each_entry(mapping, &lobj->bo_va->valids, list) {
-- 
1.9.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH 05/12] drm/amdgpu: Send no-retry XNACK for all fault types

2017-07-03 Thread Felix Kuehling
From: Jay Cornwall 

A subset of VM fault types currently send retry XNACK to the client.
This causes a storm of interrupts from the VM to the host.

Until the storm is throttled by other means send no-retry XNACK for
all fault types instead. No change in behavior to the client which
will stall indefinitely with the current configuration in any case.
Improves system stability under GC or MMHUB faults.

Signed-off-by: Jay Cornwall 
Reviewed-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c | 3 +++
 drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c  | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c
index a42f483..f957b18 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c
@@ -206,6 +206,9 @@ static void gfxhub_v1_0_setup_vmid_config(struct 
amdgpu_device *adev)
tmp = REG_SET_FIELD(tmp, VM_CONTEXT1_CNTL,
PAGE_TABLE_BLOCK_SIZE,
adev->vm_manager.block_size - 9);
+   /* Send no-retry XNACK on fault to suppress VM fault storm. */
+   tmp = REG_SET_FIELD(tmp, VM_CONTEXT1_CNTL,
+   RETRY_PERMISSION_OR_INVALID_PAGE_FAULT, 0);
WREG32_SOC15_OFFSET(GC, 0, mmVM_CONTEXT1_CNTL, i, tmp);
WREG32_SOC15_OFFSET(GC, 0, 
mmVM_CONTEXT1_PAGE_TABLE_START_ADDR_LO32, i*2, 0);
WREG32_SOC15_OFFSET(GC, 0, 
mmVM_CONTEXT1_PAGE_TABLE_START_ADDR_HI32, i*2, 0);
diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c 
b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c
index 01918dc..b760018 100644
--- a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c
@@ -222,6 +222,9 @@ static void mmhub_v1_0_setup_vmid_config(struct 
amdgpu_device *adev)
tmp = REG_SET_FIELD(tmp, VM_CONTEXT1_CNTL,
PAGE_TABLE_BLOCK_SIZE,
adev->vm_manager.block_size - 9);
+   /* Send no-retry XNACK on fault to suppress VM fault storm. */
+   tmp = REG_SET_FIELD(tmp, VM_CONTEXT1_CNTL,
+   RETRY_PERMISSION_OR_INVALID_PAGE_FAULT, 0);
WREG32_SOC15_OFFSET(MMHUB, 0, mmVM_CONTEXT1_CNTL, i, tmp);
WREG32_SOC15_OFFSET(MMHUB, 0, 
mmVM_CONTEXT1_PAGE_TABLE_START_ADDR_LO32, i*2, 0);
WREG32_SOC15_OFFSET(MMHUB, 0, 
mmVM_CONTEXT1_PAGE_TABLE_START_ADDR_HI32, i*2, 0);
-- 
1.9.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH 00/12] Patches from amd-kfd-staging

2017-07-03 Thread Felix Kuehling
Various cleaned-up and some squashed patches from amd-kfd-staging
that are not necessarily KFD-on-dGPU-specific. The intention is to
minimize differences between amd-kfd-staging and upstream before
reviewing KFD-specific changes for upstreaming.

Patches 7-12 are a rebased (multiple times) patch series by
Christian for allowing foreign BO imports for peer-to-peer buffer
access.

Amber Lin (1):
  drm/amdgpu: handle foreign BOs in the VM mapping

Christian König (5):
  drm: export drm_gem_prime_dmabuf_ops
  drm/amdgpu: disallow foreign BOs for UVD/VCE
  drm/amdgpu: disallow foreign BOs in the display path
  drm/amdgpu: separate BO from GEM object
  drm/amdgpu: enable foreign DMA-buf objects

Felix Kuehling (3):
  drm/amdgpu: implement vm_operations_struct.access
  drm/amdgpu: Enable SDMA context switching for CIK
  drm/amdgpu: Make SDMA phase quantum configurable

Jay Cornwall (1):
  drm/amdgpu: Send no-retry XNACK for all fault types

Yong Zhao (1):
  drm/amdgpu: Correctly establish the suspend/resume hook for amdkfd

shaoyunl (1):
  drm/amdgpu: Enable SDMA_CNTL.ATC_L1_ENABLE for SDMA on CZ

 drivers/gpu/drm/amd/amdgpu/amdgpu.h |  15 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  |   3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c  |   7 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_display.c |   6 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c |   6 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c |  41 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c  |   7 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c   |  79 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 147 +++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h |   2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c  |  17 +++-
 drivers/gpu/drm/amd/amdgpu/cik.c|   9 +-
 drivers/gpu/drm/amd/amdgpu/cik_sdma.c   |  60 
 drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c|   3 +
 drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c |   3 +
 drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c  |  42 +++-
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c  |  34 ++-
 drivers/gpu/drm/drm_prime.c |   3 +-
 include/drm/drmP.h  |   2 +
 19 files changed, 446 insertions(+), 40 deletions(-)

-- 
1.9.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH 07/12] drm: export drm_gem_prime_dmabuf_ops

2017-07-03 Thread Felix Kuehling
From: Christian König 

This allows drivers to check if a DMA-buf contains a GEM object or not.

Signed-off-by: Christian König 
Reviewed-by: Felix Kuehling 
---
 drivers/gpu/drm/drm_prime.c | 3 ++-
 include/drm/drmP.h  | 2 ++
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
index 25aa455..5cb4fd9 100644
--- a/drivers/gpu/drm/drm_prime.c
+++ b/drivers/gpu/drm/drm_prime.c
@@ -396,7 +396,7 @@ static int drm_gem_dmabuf_mmap(struct dma_buf *dma_buf,
return dev->driver->gem_prime_mmap(obj, vma);
 }
 
-static const struct dma_buf_ops drm_gem_prime_dmabuf_ops =  {
+const struct dma_buf_ops drm_gem_prime_dmabuf_ops =  {
.attach = drm_gem_map_attach,
.detach = drm_gem_map_detach,
.map_dma_buf = drm_gem_map_dma_buf,
@@ -410,6 +410,7 @@ static int drm_gem_dmabuf_mmap(struct dma_buf *dma_buf,
.vmap = drm_gem_dmabuf_vmap,
.vunmap = drm_gem_dmabuf_vunmap,
 };
+EXPORT_SYMBOL(drm_gem_prime_dmabuf_ops);
 
 /**
  * DOC: PRIME Helpers
diff --git a/include/drm/drmP.h b/include/drm/drmP.h
index 6105c05..e0ea8f8 100644
--- a/include/drm/drmP.h
+++ b/include/drm/drmP.h
@@ -761,6 +761,8 @@ static inline int drm_debugfs_remove_files(const struct 
drm_info_list *files,
 
 struct dma_buf_export_info;
 
+extern const struct dma_buf_ops drm_gem_prime_dmabuf_ops;
+
 extern struct dma_buf *drm_gem_prime_export(struct drm_device *dev,
struct drm_gem_object *obj,
int flags);
-- 
1.9.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH 10/12] drm/amdgpu: separate BO from GEM object

2017-07-03 Thread Felix Kuehling
From: Christian König 

This allows us to have multiple GEM objects for one BO.

Signed-off-by: Christian König 
Reviewed-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h| 12 +++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c| 41 +++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c |  7 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c  | 20 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c| 17 +++--
 5 files changed, 77 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 2129fbb..f3d99cb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -414,6 +414,12 @@ struct amdgpu_bo_va {
 
 #define AMDGPU_GEM_DOMAIN_MAX  0x3
 
+struct amdgpu_gem_object {
+   struct drm_gem_object   base;
+   struct list_headlist;
+   struct amdgpu_bo*bo;
+};
+
 struct amdgpu_bo {
/* Protected by tbo.reserved */
u32 prefered_domains;
@@ -430,12 +436,14 @@ struct amdgpu_bo {
void*metadata;
u32 metadata_size;
unsignedprime_shared_count;
+   /* GEM objects referring to this BO */
+   struct list_headgem_objects;
+
/* list of all virtual address to which this bo
 * is associated to
 */
struct list_headva;
/* Constant after initialization */
-   struct drm_gem_object   gem_base;
struct amdgpu_bo*parent;
struct amdgpu_bo*shadow;
 
@@ -444,7 +452,7 @@ struct amdgpu_bo {
struct list_headmn_list;
struct list_headshadow_list;
 };
-#define gem_to_amdgpu_bo(gobj) container_of((gobj), struct amdgpu_bo, gem_base)
+#define gem_to_amdgpu_bo(gobj) container_of((gobj), struct amdgpu_gem_object, 
base)->bo
 
 void amdgpu_gem_object_free(struct drm_gem_object *obj);
 int amdgpu_gem_object_open(struct drm_gem_object *obj,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index 96c4493..f7e9bdf 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -33,14 +33,20 @@
 
 void amdgpu_gem_object_free(struct drm_gem_object *gobj)
 {
-   struct amdgpu_bo *robj = gem_to_amdgpu_bo(gobj);
+   struct amdgpu_gem_object *aobj;
 
-   if (robj) {
-   if (robj->gem_base.import_attach)
-   drm_prime_gem_destroy(&robj->gem_base, robj->tbo.sg);
-   amdgpu_mn_unregister(robj);
-   amdgpu_bo_unref(&robj);
-   }
+   aobj = container_of((gobj), struct amdgpu_gem_object, base);
+   if (aobj->base.import_attach)
+   drm_prime_gem_destroy(&aobj->base, aobj->bo->tbo.sg);
+
+   ww_mutex_lock(&aobj->bo->tbo.resv->lock, NULL);
+   list_del(&aobj->list);
+   ww_mutex_unlock(&aobj->bo->tbo.resv->lock);
+
+   amdgpu_mn_unregister(aobj->bo);
+   amdgpu_bo_unref(&aobj->bo);
+   drm_gem_object_release(&aobj->base);
+   kfree(aobj);
 }
 
 int amdgpu_gem_object_create(struct amdgpu_device *adev, unsigned long size,
@@ -49,6 +55,7 @@ int amdgpu_gem_object_create(struct amdgpu_device *adev, 
unsigned long size,
struct drm_gem_object **obj)
 {
struct amdgpu_bo *robj;
+   struct amdgpu_gem_object *gobj;
unsigned long max_size;
int r;
 
@@ -83,7 +90,23 @@ int amdgpu_gem_object_create(struct amdgpu_device *adev, 
unsigned long size,
}
return r;
}
-   *obj = &robj->gem_base;
+
+   gobj = kzalloc(sizeof(struct amdgpu_gem_object), GFP_KERNEL);
+   if (unlikely(!gobj)) {
+   amdgpu_bo_unref(&robj);
+   return -ENOMEM;
+   }
+
+   r = drm_gem_object_init(adev->ddev, &gobj->base, amdgpu_bo_size(robj));
+   if (unlikely(r)) {
+   kfree(gobj);
+   amdgpu_bo_unref(&robj);
+   return r;
+   }
+
+   list_add(&gobj->list, &robj->gem_objects);
+   gobj->bo = robj;
+   *obj = &gobj->base;
 
return 0;
 }
@@ -703,7 +726,7 @@ int amdgpu_gem_op_ioctl(struct drm_device *dev, void *data,
struct drm_amdgpu_gem_create_in info;
void __user *out = (void __user *)(uintptr_t)args->value;
 
-   info.bo_size = robj->gem_base.size;
+   info.bo_size = amdgpu_bo_size(robj);
info.alignment = robj->tbo.mem.page_alignment << PAGE_SHIFT;
info.domains = robj->prefered_domains;
info.domain_flags = robj->flags;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index c34cf2c..44b7e71 100644
--- 

[PATCH 01/12] drm/amdgpu: implement vm_operations_struct.access

2017-07-03 Thread Felix Kuehling
Allows gdb to access contents of user mode mapped BOs.

Signed-off-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 130 +++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h |   2 +
 2 files changed, 131 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 15148f1..3f927c2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1237,6 +1237,134 @@ void amdgpu_ttm_set_active_vram_size(struct 
amdgpu_device *adev, u64 size)
man->size = size >> PAGE_SHIFT;
 }
 
+static struct vm_operations_struct amdgpu_ttm_vm_ops;
+static const struct vm_operations_struct *ttm_vm_ops /* = NULL;
+ * (appease checkpatch) */;
+static int amdgpu_ttm_bo_access_vram(struct amdgpu_bo *abo,
+unsigned long offset,
+void *buf, int len, int write)
+{
+   struct amdgpu_device *adev = amdgpu_ttm_adev(abo->tbo.bdev);
+   struct drm_mm_node *nodes = abo->tbo.mem.mm_node;
+   uint32_t value = 0;
+   int result = 0;
+   uint64_t pos;
+   unsigned long flags;
+
+   while (offset >= (nodes->size << PAGE_SHIFT)) {
+   offset -= nodes->size << PAGE_SHIFT;
+   ++nodes;
+   }
+   pos = (nodes->start << PAGE_SHIFT) + offset;
+
+   while (len && pos < adev->mc.mc_vram_size) {
+   uint64_t aligned_pos = pos & ~(uint64_t)3;
+   uint32_t bytes = 4 - (pos & 3);
+   uint32_t shift = (pos & 3) * 8;
+   uint32_t mask = 0xffffffff << shift;
+
+   if (len < bytes) {
+   mask &= 0xffffffff >> (bytes - len) * 8;
+   bytes = len;
+   }
+
+   spin_lock_irqsave(&adev->mmio_idx_lock, flags);
+   WREG32(mmMM_INDEX, ((uint32_t)aligned_pos) | 0x80000000);
+   WREG32(mmMM_INDEX_HI, aligned_pos >> 31);
+   if (!write || mask != 0xffffffff)
+   value = RREG32(mmMM_DATA);
+   if (write) {
+   value &= ~mask;
+   value |= (*(uint32_t *)buf << shift) & mask;
+   WREG32(mmMM_DATA, value);
+   }
+   spin_unlock_irqrestore(&adev->mmio_idx_lock, flags);
+   if (!write) {
+   value = (value & mask) >> shift;
+   memcpy(buf, &value, bytes);
+   }
+
+   result += bytes;
+   buf = (uint8_t *)buf + bytes;
+   pos += bytes;
+   len -= bytes;
+   if (pos >= (nodes->start + nodes->size) << PAGE_SHIFT) {
+   ++nodes;
+   pos = (nodes->start << PAGE_SHIFT);
+   }
+   }
+
+   return result;
+}
+
+static int amdgpu_ttm_bo_access_kmap(struct amdgpu_bo *abo,
+unsigned long offset,
+void *buf, int len, int write)
+{
+   struct ttm_buffer_object *bo = >tbo;
+   struct ttm_bo_kmap_obj map;
+   void *ptr;
+   bool is_iomem;
+   int r;
+
+   r = ttm_bo_kmap(bo, 0, bo->num_pages, &map);
+   if (r)
+   return r;
+   ptr = (uint8_t *)ttm_kmap_obj_virtual(&map, &is_iomem) + offset;
+   WARN_ON(is_iomem);
+   if (write)
+   memcpy(ptr, buf, len);
+   else
+   memcpy(buf, ptr, len);
+   ttm_bo_kunmap(&map);
+
+   return len;
+}
+
+static int amdgpu_ttm_vm_access(struct vm_area_struct *vma, unsigned long addr,
+   void *buf, int len, int write)
+{
+   unsigned long offset = (addr) - vma->vm_start;
+   struct ttm_buffer_object *bo = vma->vm_private_data;
+   struct amdgpu_bo *abo = container_of(bo, struct amdgpu_bo, tbo);
+   unsigned domain;
+   int result;
+
+   result = amdgpu_bo_reserve(abo, false);
+   if (result != 0)
+   return result;
+
+   domain = amdgpu_mem_type_to_domain(bo->mem.mem_type);
+   if (domain == AMDGPU_GEM_DOMAIN_VRAM)
+   result = amdgpu_ttm_bo_access_vram(abo, offset,
+  buf, len, write);
+   else
+   result = amdgpu_ttm_bo_access_kmap(abo, offset,
+  buf, len, write);
+   amdgpu_bo_unreserve(abo);
+
+   return len;
+}
+
+int amdgpu_bo_mmap(struct file *filp, struct vm_area_struct *vma,
+  struct ttm_bo_device *bdev)
+{
+   int r;
+
+   r = ttm_bo_mmap(filp, vma, bdev);
+   if (unlikely(r != 0))
+   return r;
+
+   if (unlikely(ttm_vm_ops == NULL)) {
+   ttm_vm_ops = vma->vm_ops;
+   amdgpu_ttm_vm_ops = *ttm_vm_ops;
+   

[PATCH 09/12] drm/amdgpu: disallow foreign BOs in the display path

2017-07-03 Thread Felix Kuehling
From: Christian König 

Pinning them in other devices VRAM would obviously not work.

Signed-off-by: Christian König 
Reviewed-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_display.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
index 3341c34..bd6b0dc 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
@@ -180,6 +180,12 @@ int amdgpu_crtc_page_flip_target(struct drm_crtc *crtc,
obj = new_amdgpu_fb->obj;
new_abo = gem_to_amdgpu_bo(obj);
 
+   if (amdgpu_ttm_adev(new_abo->tbo.bdev) != adev) {
+   DRM_ERROR("Foreign BOs not allowed in the display engine\n");
+   r = -EINVAL;
+   goto cleanup;
+   }
+
/* pin the new buffer */
r = amdgpu_bo_reserve(new_abo, false);
if (unlikely(r != 0)) {
-- 
1.9.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH 03/12] drm/amdgpu: Enable SDMA context switching for CIK

2017-07-03 Thread Felix Kuehling
Enable SDMA context switching on CIK (copied from sdma_v3_0.c).

Signed-off-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdgpu/cik_sdma.c | 30 ++
 1 file changed, 30 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c 
b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
index c216e16..4a9cea0 100644
--- a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
+++ b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
@@ -342,6 +342,33 @@ static void cik_sdma_rlc_stop(struct amdgpu_device *adev)
 }
 
 /**
+ * cik_ctx_switch_enable - enable/disable the async dma engines context switch
+ *
+ * @adev: amdgpu_device pointer
+ * @enable: enable/disable the DMA MEs context switch.
+ *
+ * Halt or unhalt the async dma engines context switch (CIK).
+ */
+static void cik_ctx_switch_enable(struct amdgpu_device *adev, bool enable)
+{
+   u32 f32_cntl;
+   int i;
+
+   for (i = 0; i < adev->sdma.num_instances; i++) {
+   f32_cntl = RREG32(mmSDMA0_CNTL + sdma_offsets[i]);
+   if (enable) {
+   f32_cntl = REG_SET_FIELD(f32_cntl, SDMA0_CNTL,
+   AUTO_CTXSW_ENABLE, 1);
+   } else {
+   f32_cntl = REG_SET_FIELD(f32_cntl, SDMA0_CNTL,
+   AUTO_CTXSW_ENABLE, 0);
+   }
+
+   WREG32(mmSDMA0_CNTL + sdma_offsets[i], f32_cntl);
+   }
+}
+
+/**
  * cik_sdma_enable - stop the async dma engines
  *
  * @adev: amdgpu_device pointer
@@ -537,6 +564,8 @@ static int cik_sdma_start(struct amdgpu_device *adev)
 
/* halt the engine before programing */
cik_sdma_enable(adev, false);
+   /* enable sdma ring preemption */
+   cik_ctx_switch_enable(adev, true);
 
/* start the gfx rings and rlc compute queues */
r = cik_sdma_gfx_resume(adev);
@@ -984,6 +1013,7 @@ static int cik_sdma_hw_fini(void *handle)
 {
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
+   cik_ctx_switch_enable(adev, false);
cik_sdma_enable(adev, false);
 
return 0;
-- 
1.9.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH 06/12] drm/amdgpu: Correctly establish the suspend/resume hook for amdkfd

2017-07-03 Thread Felix Kuehling
From: Yong Zhao 

Signed-off-by: Yong Zhao 
Reviewed-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 7 +++
 drivers/gpu/drm/amd/amdgpu/cik.c   | 9 +
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 5b1220f..bc69b9c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -56,6 +56,8 @@
 #include 
 #include "amdgpu_vf_error.h"
 
+#include "amdgpu_amdkfd.h"
+
 MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
 MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
 
@@ -2397,6 +2399,8 @@ int amdgpu_device_suspend(struct drm_device *dev, bool 
suspend, bool fbcon)
drm_modeset_unlock_all(dev);
}
 
+   amdgpu_amdkfd_suspend(adev);
+
/* unpin the front buffers and cursors */
list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
struct amdgpu_crtc *amdgpu_crtc = to_amdgpu_crtc(crtc);
@@ -2537,6 +2541,9 @@ int amdgpu_device_resume(struct drm_device *dev, bool 
resume, bool fbcon)
}
}
}
+   r = amdgpu_amdkfd_resume(adev);
+   if (r)
+   return r;
 
/* blat the mode back in */
if (fbcon) {
diff --git a/drivers/gpu/drm/amd/amdgpu/cik.c b/drivers/gpu/drm/amd/amdgpu/cik.c
index 6ce9f80..00639bf 100644
--- a/drivers/gpu/drm/amd/amdgpu/cik.c
+++ b/drivers/gpu/drm/amd/amdgpu/cik.c
@@ -1825,21 +1825,14 @@ static int cik_common_suspend(void *handle)
 {
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
-   amdgpu_amdkfd_suspend(adev);
-
return cik_common_hw_fini(adev);
 }
 
 static int cik_common_resume(void *handle)
 {
-   int r;
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
-   r = cik_common_hw_init(adev);
-   if (r)
-   return r;
-
-   return amdgpu_amdkfd_resume(adev);
+   return cik_common_hw_init(adev);
 }
 
 static bool cik_common_is_idle(void *handle)
-- 
1.9.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH 04/12] drm/amdgpu: Make SDMA phase quantum configurable

2017-07-03 Thread Felix Kuehling
Set a configurable SDMA phase quantum when enabling SDMA context
switching. The default value significantly reduces SDMA latency
in page table updates when user-mode SDMA queues have concurrent
activity, compared to the initial HW setting.

Signed-off-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c |  4 
 drivers/gpu/drm/amd/amdgpu/cik_sdma.c   | 32 ++-
 drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c  | 32 ++-
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c  | 34 -
 5 files changed, 100 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 810796a..2129fbb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -106,6 +106,7 @@
 extern unsigned amdgpu_pcie_lane_cap;
 extern unsigned amdgpu_cg_mask;
 extern unsigned amdgpu_pg_mask;
+extern unsigned amdgpu_sdma_phase_quantum;
 extern char *amdgpu_disable_cu;
 extern char *amdgpu_virtual_display;
 extern unsigned amdgpu_pp_feature_mask;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 4bf4a80..02cf24e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -107,6 +107,7 @@
 unsigned amdgpu_pcie_lane_cap = 0;
 unsigned amdgpu_cg_mask = 0xffffffff;
 unsigned amdgpu_pg_mask = 0xffffffff;
+unsigned amdgpu_sdma_phase_quantum = 32;
 char *amdgpu_disable_cu = NULL;
 char *amdgpu_virtual_display = NULL;
 unsigned amdgpu_pp_feature_mask = 0xffffffff;
@@ -223,6 +224,9 @@
 MODULE_PARM_DESC(pg_mask, "Powergating flags mask (0 = disable power gating)");
 module_param_named(pg_mask, amdgpu_pg_mask, uint, 0444);
 
+MODULE_PARM_DESC(sdma_phase_quantum, "SDMA context switch phase quantum (x 1K 
GPU clock cycles, 0 = no change (default 32))");
+module_param_named(sdma_phase_quantum, amdgpu_sdma_phase_quantum, uint, 0444);
+
 MODULE_PARM_DESC(disable_cu, "Disable CUs (se.sh.cu,...)");
 module_param_named(disable_cu, amdgpu_disable_cu, charp, 0444);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c 
b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
index 4a9cea0..f508f4d 100644
--- a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
+++ b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
@@ -351,14 +351,44 @@ static void cik_sdma_rlc_stop(struct amdgpu_device *adev)
  */
 static void cik_ctx_switch_enable(struct amdgpu_device *adev, bool enable)
 {
-   u32 f32_cntl;
+   u32 f32_cntl, phase_quantum = 0;
int i;
 
+   if (amdgpu_sdma_phase_quantum) {
+   unsigned value = amdgpu_sdma_phase_quantum;
+   unsigned unit = 0;
+
+   while (value > (SDMA0_PHASE0_QUANTUM__VALUE_MASK >>
+   SDMA0_PHASE0_QUANTUM__VALUE__SHIFT)) {
+   value = (value + 1) >> 1;
+   unit++;
+   }
+   if (unit > (SDMA0_PHASE0_QUANTUM__UNIT_MASK >>
+   SDMA0_PHASE0_QUANTUM__UNIT__SHIFT)) {
+   value = (SDMA0_PHASE0_QUANTUM__VALUE_MASK >>
+SDMA0_PHASE0_QUANTUM__VALUE__SHIFT);
+   unit = (SDMA0_PHASE0_QUANTUM__UNIT_MASK >>
+   SDMA0_PHASE0_QUANTUM__UNIT__SHIFT);
+   WARN_ONCE(1,
+   "clamping sdma_phase_quantum to %uK clock cycles\n",
+ value << unit);
+   }
+   phase_quantum =
+   value << SDMA0_PHASE0_QUANTUM__VALUE__SHIFT |
+   unit  << SDMA0_PHASE0_QUANTUM__UNIT__SHIFT;
+   }
+
for (i = 0; i < adev->sdma.num_instances; i++) {
f32_cntl = RREG32(mmSDMA0_CNTL + sdma_offsets[i]);
if (enable) {
f32_cntl = REG_SET_FIELD(f32_cntl, SDMA0_CNTL,
AUTO_CTXSW_ENABLE, 1);
+   if (amdgpu_sdma_phase_quantum) {
+   WREG32(mmSDMA0_PHASE0_QUANTUM + sdma_offsets[i],
+  phase_quantum);
+   WREG32(mmSDMA0_PHASE1_QUANTUM + sdma_offsets[i],
+  phase_quantum);
+   }
} else {
f32_cntl = REG_SET_FIELD(f32_cntl, SDMA0_CNTL,
AUTO_CTXSW_ENABLE, 0);
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c 
b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
index 67a29fb..b1de44f 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
@@ -551,9 +551,33 @@ static void sdma_v3_0_rlc_stop(struct amdgpu_device *adev)
  */
 static void sdma_v3_0_ctx_switch_enable(struct amdgpu_device *adev, bool 
enable)
 {
-   u32 f32_cntl;
+   

[PATCH 12/12] drm/amdgpu: enable foreign DMA-buf objects

2017-07-03 Thread Felix Kuehling
From: Christian König 

We should be able to handle BOs from other instances as well.

Signed-off-by: Christian König 
Reviewed-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h   |  2 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c   |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c | 59 +++
 3 files changed, 62 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index f3d99cb..18b2c28 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -468,6 +468,8 @@ struct drm_gem_object *
 struct dma_buf *amdgpu_gem_prime_export(struct drm_device *dev,
struct drm_gem_object *gobj,
int flags);
+struct drm_gem_object *amdgpu_gem_prime_import(struct drm_device *dev,
+  struct dma_buf *dma_buf);
 int amdgpu_gem_prime_pin(struct drm_gem_object *obj);
 void amdgpu_gem_prime_unpin(struct drm_gem_object *obj);
 struct reservation_object *amdgpu_gem_prime_res_obj(struct drm_gem_object *);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 02cf24e..df78a3a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -813,7 +813,7 @@ long amdgpu_drm_ioctl(struct file *filp,
.prime_handle_to_fd = drm_gem_prime_handle_to_fd,
.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
.gem_prime_export = amdgpu_gem_prime_export,
-   .gem_prime_import = drm_gem_prime_import,
+   .gem_prime_import = amdgpu_gem_prime_import,
.gem_prime_pin = amdgpu_gem_prime_pin,
.gem_prime_unpin = amdgpu_gem_prime_unpin,
.gem_prime_res_obj = amdgpu_gem_prime_res_obj,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
index b9425ed..9f7fae8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
@@ -159,3 +159,62 @@ struct dma_buf *amdgpu_gem_prime_export(struct drm_device 
*dev,
 
return drm_gem_prime_export(dev, gobj, flags);
 }
+
+static struct drm_gem_object *
+amdgpu_gem_prime_foreign_bo(struct amdgpu_device *adev, struct amdgpu_bo *bo)
+{
+   struct amdgpu_gem_object *gobj;
+   int r;
+
+   ww_mutex_lock(&bo->tbo.resv->lock, NULL);
+
+   list_for_each_entry(gobj, >gem_objects, list) {
+   if (gobj->base.dev != adev->ddev)
+   continue;
+
+   ww_mutex_unlock(&bo->tbo.resv->lock);
+   drm_gem_object_reference(&gobj->base);
+   return &gobj->base;
+   }
+
+
+   gobj = kzalloc(sizeof(struct amdgpu_gem_object), GFP_KERNEL);
+   if (unlikely(!gobj)) {
+   ww_mutex_unlock(&bo->tbo.resv->lock);
+   return ERR_PTR(-ENOMEM);
+   }
+
+   r = drm_gem_object_init(adev->ddev, &gobj->base, amdgpu_bo_size(bo));
+   if (unlikely(r)) {
+   kfree(gobj);
+   ww_mutex_unlock(&bo->tbo.resv->lock);
+   return ERR_PTR(r);
+   }
+
+   list_add(&gobj->list, &bo->gem_objects);
+   gobj->bo = amdgpu_bo_ref(bo);
+   bo->flags |= AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
+
+   ww_mutex_unlock(&bo->tbo.resv->lock);
+
+   return &gobj->base;
+}
+
+struct drm_gem_object *amdgpu_gem_prime_import(struct drm_device *dev,
+  struct dma_buf *dma_buf)
+{
+   struct amdgpu_device *adev = dev->dev_private;
+
+   if (dma_buf->ops == &drm_gem_prime_dmabuf_ops) {
+   struct drm_gem_object *obj = dma_buf->priv;
+
+   if (obj->dev != dev && obj->dev->driver == dev->driver) {
+   /* It's an amdgpu_bo from a different driver instance */
+   struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj);
+
+   return amdgpu_gem_prime_foreign_bo(adev, bo);
+   }
+   }
+
+   return drm_gem_prime_import(dev, dma_buf);
+}
-- 
1.9.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH 02/12] drm/amdgpu: Enable SDMA_CNTL.ATC_L1_ENABLE for SDMA on CZ

2017-07-03 Thread Felix Kuehling
From: shaoyunl 

For the GFX context, the ATC bit in SDMA*_GFX_VIRTUAL_ADDRESS can be cleared
to operate in VM mode. For the RLC context, to support ATC mode, the ATC bit
in SDMA*_RLC*_VIRTUAL_ADDRESS should be set. SDMA_CNTL.ATC_L1_ENABLE is a
global setting that enables the L1-L2 translation for ATC addresses.

Signed-off-by: shaoyun liu 
Reviewed-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c 
b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
index 1d766ae..67a29fb 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
@@ -556,12 +556,18 @@ static void sdma_v3_0_ctx_switch_enable(struct 
amdgpu_device *adev, bool enable)
 
for (i = 0; i < adev->sdma.num_instances; i++) {
f32_cntl = RREG32(mmSDMA0_CNTL + sdma_offsets[i]);
-   if (enable)
+   if (enable) {
f32_cntl = REG_SET_FIELD(f32_cntl, SDMA0_CNTL,
AUTO_CTXSW_ENABLE, 1);
-   else
+   f32_cntl = REG_SET_FIELD(f32_cntl, SDMA0_CNTL,
+   ATC_L1_ENABLE, 1);
+   } else {
f32_cntl = REG_SET_FIELD(f32_cntl, SDMA0_CNTL,
AUTO_CTXSW_ENABLE, 0);
+   f32_cntl = REG_SET_FIELD(f32_cntl, SDMA0_CNTL,
+   ATC_L1_ENABLE, 1);
+   }
+
WREG32(mmSDMA0_CNTL + sdma_offsets[i], f32_cntl);
}
 }
-- 
1.9.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH 11/12] drm/amdgpu: handle foreign BOs in the VM mapping

2017-07-03 Thread Felix Kuehling
From: Amber Lin 

Set the system bit for foreign BO mappings and use the remote VRAM
BAR address as the VRAM base offset.

Signed-off-by: Amber Lin 
Reviewed-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 17 +
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 1d1810d..5f08e81 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -1582,6 +1582,7 @@ static int amdgpu_vm_bo_split_mapping(struct 
amdgpu_device *adev,
  dma_addr_t *pages_addr,
  struct amdgpu_vm *vm,
  struct amdgpu_bo_va_mapping *mapping,
+ uint64_t vram_base_offset,
  uint64_t flags,
  struct drm_mm_node *nodes,
  struct dma_fence **fence)
@@ -1640,7 +1641,7 @@ static int amdgpu_vm_bo_split_mapping(struct 
amdgpu_device *adev,
max_entries = min(max_entries, 16ull * 1024ull);
addr = 0;
} else if (flags & AMDGPU_PTE_VALID) {
-   addr += adev->vm_manager.vram_base_offset;
+   addr += vram_base_offset;
}
addr += pfn << PAGE_SHIFT;
 
@@ -1685,6 +1686,8 @@ int amdgpu_vm_bo_update(struct amdgpu_device *adev,
struct ttm_mem_reg *mem;
struct drm_mm_node *nodes;
struct dma_fence *exclusive;
+   uint64_t vram_base_offset = adev->vm_manager.vram_base_offset;
+   struct amdgpu_device *bo_adev;
int r;
 
if (clear || !bo_va->bo) {
@@ -1706,9 +1709,15 @@ int amdgpu_vm_bo_update(struct amdgpu_device *adev,
 
if (bo_va->bo) {
flags = amdgpu_ttm_tt_pte_flags(adev, bo_va->bo->tbo.ttm, mem);
+   bo_adev = amdgpu_ttm_adev(bo_va->bo->tbo.bdev);
gtt_flags = (amdgpu_ttm_is_bound(bo_va->bo->tbo.ttm) &&
-   adev == amdgpu_ttm_adev(bo_va->bo->tbo.bdev)) ?
+   adev == bo_adev) ?
flags : 0;
+   if (mem && mem->mem_type == TTM_PL_VRAM &&
+   adev != bo_adev) {
+   flags |= AMDGPU_PTE_SYSTEM;
+   vram_base_offset = bo_adev->mc.aper_base;
+   }
} else {
flags = 0x0;
gtt_flags = ~0x0;
@@ -1722,8 +1731,8 @@ int amdgpu_vm_bo_update(struct amdgpu_device *adev,
list_for_each_entry(mapping, _va->invalids, list) {
r = amdgpu_vm_bo_split_mapping(adev, exclusive,
   gtt_flags, pages_addr, vm,
-  mapping, flags, nodes,
-  _va->last_pt_update);
+  mapping, vram_base_offset, flags,
+  nodes, _va->last_pt_update);
if (r)
return r;
}
-- 
1.9.1



[PATCH] drm/amdgpu: check scratch registers to see if we need post (v2)

2017-07-03 Thread Alex Deucher
Check the BIOS scratch registers rather than the CONFIG_MEMSIZE register,
as the latter may not be reliable on some APUs.

v2: The scratch register is only used on CIK+
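The resulting ordering can be sketched as a small decision function. The enum values and the `scratch_needs_init`/`memsize` parameters are illustrative stand-ins for the real register reads, not the driver's actual names:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical ASIC ordering; the driver compares adev->asic_type
 * against CHIP_BONAIRE, the first CIK part. */
enum asic { CHIP_PRE_CIK = 0, CHIP_BONAIRE = 1, CHIP_LATER = 2 };

/* Sketch of the v2 logic: CIK+ trusts the BIOS scratch register and
 * never consults MEM_SIZE, which is why the scratch check comes first.
 * Older ASICs keep the MEM_SIZE heuristic. */
static bool need_post(enum asic type, bool scratch_needs_init,
		      uint32_t memsize)
{
	if (type >= CHIP_BONAIRE)
		return scratch_needs_init;
	/* a sane MEM_SIZE means the card was already posted */
	if (memsize != 0 && memsize != 0xffffffff)
		return false;
	return true;
}
```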

Reviewed-by: Christian König 
Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 63f4bed..8042a8a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -722,7 +722,12 @@ bool amdgpu_need_post(struct amdgpu_device *adev)
adev->has_hw_reset = false;
return true;
}
-   /* then check MEM_SIZE, in case the crtcs are off */
+
+   /* bios scratch used on CIK+ */
+   if (adev->asic_type >= CHIP_BONAIRE)
+   return amdgpu_atombios_scratch_need_asic_init(adev);
+
+   /* check MEM_SIZE for older asics */
reg = amdgpu_asic_get_config_memsize(adev);
 
if ((reg != 0) && (reg != 0x))
-- 
2.5.5



Re: [PATCH v2 1/2] drm/ttm: Fix use-after-free in ttm_bo_clean_mm

2017-07-03 Thread Christian König

I've gone ahead and pushed the two to our amd-staging-4.11 branch.

Alex will certainly pick that up for his next fixes pull request.

Thanks for the help,
Christian.

Am 03.07.2017 um 20:05 schrieb John Brooks:

We unref the man->move fence in ttm_bo_clean_mm() and then call
ttm_bo_force_list_clean() which waits on it, except the refcount is now
zero so a warning is generated (or worse):

[149492.279301] refcount_t: increment on 0; use-after-free.
[149492.279309] [ cut here ]
[149492.279315] WARNING: CPU: 3 PID: 18726 at lib/refcount.c:150 
refcount_inc+0x2b/0x30
[149492.279315] Modules linked in: vhost_net vhost tun x86_pkg_temp_thermal 
crc32_pclmul ghash_clmulni_intel efivarfs amdgpu(
-) i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops 
ttm drm
[149492.279326] CPU: 3 PID: 18726 Comm: rmmod Not tainted 
4.12.0-rc5-drm-next-4.13-ttmpatch+ #1
[149492.279326] Hardware name: Gigabyte Technology Co., Ltd. 
Z97X-UD3H-BK/Z97X-UD3H-BK-CF, BIOS F6 06/17/2014
[149492.279327] task: 8804ddfedcc0 task.stack: c90008d2
[149492.279329] RIP: 0010:refcount_inc+0x2b/0x30
[149492.279330] RSP: 0018:c90008d23c30 EFLAGS: 00010286
[149492.279331] RAX: 002b RBX: 0170 RCX: 

[149492.279331] RDX:  RSI: 88051ecccbe8 RDI: 
88051ecccbe8
[149492.279332] RBP: c90008d23c30 R08: 0001 R09: 
03ee
[149492.279333] R10: c90008d23bb0 R11: 03ee R12: 
88043aaac960
[149492.279333] R13: 8805005e28a8 R14: 0002 R15: 
88050115e178
[149492.279334] FS:  7fc540168700() GS:88051ecc() 
knlGS:
[149492.279335] CS:  0010 DS:  ES:  CR0: 80050033
[149492.279336] CR2: 7fc3e8654140 CR3: 00027ba77000 CR4: 
001426e0
[149492.279337] DR0:  DR1:  DR2: 

[149492.279337] DR3:  DR6: fffe0ff0 DR7: 
0400
[149492.279338] Call Trace:
[149492.279345]  ttm_bo_force_list_clean+0xb9/0x110 [ttm]
[149492.279348]  ttm_bo_clean_mm+0x7a/0xe0 [ttm]
[149492.279375]  amdgpu_ttm_fini+0xc9/0x1f0 [amdgpu]
[149492.279392]  amdgpu_bo_fini+0x12/0x40 [amdgpu]
[149492.279415]  gmc_v7_0_sw_fini+0x32/0x40 [amdgpu]
[149492.279430]  amdgpu_fini+0x2c9/0x490 [amdgpu]
[149492.279445]  amdgpu_device_fini+0x58/0x1b0 [amdgpu]
[149492.279461]  amdgpu_driver_unload_kms+0x4f/0xa0 [amdgpu]
[149492.279470]  drm_dev_unregister+0x3c/0xe0 [drm]
[149492.279485]  amdgpu_pci_remove+0x19/0x30 [amdgpu]
[149492.279487]  pci_device_remove+0x39/0xc0
[149492.279490]  device_release_driver_internal+0x155/0x210
[149492.279491]  driver_detach+0x38/0x70
[149492.279493]  bus_remove_driver+0x4c/0xa0
[149492.279494]  driver_unregister+0x2c/0x40
[149492.279496]  pci_unregister_driver+0x21/0x90
[149492.279520]  amdgpu_exit+0x15/0x406 [amdgpu]
[149492.279523]  SyS_delete_module+0x1a8/0x270
[149492.279525]  ? exit_to_usermode_loop+0x92/0xa0
[149492.279528]  entry_SYSCALL_64_fastpath+0x13/0x94
[149492.279529] RIP: 0033:0x7fc53fcb68e7
[149492.279529] RSP: 002b:7ffcfbfaabb8 EFLAGS: 0206 ORIG_RAX: 
00b0
[149492.279531] RAX: ffda RBX: 563117adb200 RCX: 
7fc53fcb68e7
[149492.279531] RDX: 000a RSI: 0800 RDI: 
563117adb268
[149492.279532] RBP: 0003 R08:  R09: 
1999
[149492.279533] R10: 0883 R11: 0206 R12: 
7ffcfbfa9ba0
[149492.279533] R13:  R14:  R15: 
563117adb200
[149492.279534] Code: 55 48 89 e5 e8 77 fe ff ff 84 c0 74 02 5d c3 80 3d 40 f2 a4 00 
00 75 f5 48 c7 c7 20 3c ca 81 c6 05 30 f2 a4 00 01 e8 91 f0 d7 ff <0f> ff 5d c3 
90 55 48 89 fe bf 01 00 00 00 48 89 e5 e8 9f fe ff
[149492.279557] ---[ end trace 2d4e0ffcb66a1016 ]---

Unref the fence *after* waiting for it.

v2: Set man->move to NULL after dropping the last ref (Christian König)

Fixes: aff98ba1fdb8 (drm/ttm: wait for eviction in ttm_bo_force_list_clean)
Signed-off-by: John Brooks 
Reviewed-by: Christian König 
Reviewed-by: Alex Deucher 
---
  drivers/gpu/drm/ttm/ttm_bo.c | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index a6d7fcb..22b5702 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -1353,7 +1353,6 @@ int ttm_bo_clean_mm(struct ttm_bo_device *bdev, unsigned 
mem_type)
   mem_type);
return ret;
}
-   dma_fence_put(man->move);
  
  	man->use_type = false;

man->has_type = false;
@@ -1369,6 +1368,9 @@ int ttm_bo_clean_mm(struct ttm_bo_device *bdev, unsigned 
mem_type)
ret = (*man->func->takedown)(man);
}
  
+	dma_fence_put(man->move);

+   man->move = NULL;
+
return ret;
  }
  

[PATCH v2 2/2] drm/amdgpu: Don't call amd_powerplay_destroy() if we don't have powerplay

2017-07-03 Thread John Brooks
amd_powerplay_destroy() expects a handle pointing to a struct pp_instance.
On chips without PowerPlay, pp_handle points to a struct amdgpu_device. The
resulting attempt to kfree() fields of the wrong struct ends in fire:

[   91.560405] BUG: unable to handle kernel paging request at ebe00620
[   91.560414] IP: kfree+0x57/0x160
[   91.560416] PGD 0
[   91.560416] P4D 0

[   91.560420] Oops:  [#1] SMP
[   91.560422] Modules linked in: tun x86_pkg_temp_thermal crc32_pclmul 
ghash_clmulni_intel efivarfs amdgpu(-) i2c_algo_bit drm_kms_helper syscopyarea 
sysfillrect sysimgblt fb_sys_fops ttm drm
[   91.560438] CPU: 6 PID: 3598 Comm: rmmod Not tainted 
4.12.0-rc5-drm-next-4.13-ttmpatch+ #1
[   91.560443] Hardware name: Gigabyte Technology Co., Ltd. 
Z97X-UD3H-BK/Z97X-UD3H-BK-CF, BIOS F6 06/17/2014
[   91.560448] task: 8805063d6a00 task.stack: c9000340
[   91.560451] RIP: 0010:kfree+0x57/0x160
[   91.560454] RSP: 0018:c90003403cc0 EFLAGS: 00010286
[   91.560457] RAX: 77ff8000 RBX: 000186a0 RCX: 000180400035
[   91.560460] RDX: 000180400036 RSI: ea001418e740 RDI: ea00
[   91.560463] RBP: c90003403cd8 R08: 0639d201 R09: 000180400035
[   91.560467] R10: ebe00600 R11: 0300 R12: 880500530030
[   91.560470] R13: a01e70fc R14:  R15: 88050053
[   91.560473] FS:  7f7e500c3700() GS:88051ed8() 
knlGS:
[   91.560478] CS:  0010 DS:  ES:  CR0: 80050033
[   91.560480] CR2: ebe00620 CR3: 000503103000 CR4: 001406e0
[   91.560483] DR0:  DR1:  DR2: 
[   91.560487] DR3:  DR6: fffe0ff0 DR7: 0400
[   91.560489] Call Trace:
[   91.560530]  amd_powerplay_destroy+0x1c/0x60 [amdgpu]
[   91.560558]  amdgpu_pp_late_fini+0x44/0x60 [amdgpu]
[   91.560575]  amdgpu_fini+0x254/0x490 [amdgpu]
[   91.560593]  amdgpu_device_fini+0x58/0x1b0 [amdgpu]
[   91.560610]  amdgpu_driver_unload_kms+0x4f/0xa0 [amdgpu]
[   91.560622]  drm_dev_unregister+0x3c/0xe0 [drm]
[   91.560638]  amdgpu_pci_remove+0x19/0x30 [amdgpu]
[   91.560643]  pci_device_remove+0x39/0xc0
[   91.560648]  device_release_driver_internal+0x155/0x210
[   91.560651]  driver_detach+0x38/0x70
[   91.560655]  bus_remove_driver+0x4c/0xa0
[   91.560658]  driver_unregister+0x2c/0x40
[   91.560662]  pci_unregister_driver+0x21/0x90
[   91.560689]  amdgpu_exit+0x15/0x406 [amdgpu]
[   91.560694]  SyS_delete_module+0x1a8/0x270
[   91.560698]  ? exit_to_usermode_loop+0x92/0xa0
[   91.560702]  entry_SYSCALL_64_fastpath+0x13/0x94
[   91.560705] RIP: 0033:0x7f7e4fc118e7
[   91.560708] RSP: 002b:7fff978ca118 EFLAGS: 0206 ORIG_RAX: 
00b0
[   91.560713] RAX: ffda RBX: 55afe21bc200 RCX: 7f7e4fc118e7
[   91.560716] RDX: 000a RSI: 0800 RDI: 55afe21bc268
[   91.560719] RBP: 0003 R08:  R09: 1999
[   91.560722] R10: 0883 R11: 0206 R12: 7fff978c9100
[   91.560725] R13:  R14:  R15: 55afe21bc200
[   91.560728] Code: 00 00 00 80 ff 77 00 00 48 bf 00 00 00 00 00 ea ff ff 49 
01 da 48 0f 42 05 57 33 bd 00 49 01 c2 49 c1 ea 0c 49 c1 e2 06 49 01 fa <49> 8b 
42 20 48 8d 78 ff a8 01 4c 0f 45 d7 49 8b 52 20 48 8d 42
[   91.560759] RIP: kfree+0x57/0x160 RSP: c90003403cc0
[   91.560761] CR2: ebe00620
[   91.560765] ---[ end trace 08a9f3cd82223c1d ]---
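The failure mode above can be modeled with a toy destroy path. The struct and function names are hypothetical simplifications of the powerplay code, shown only to make the pointer-aliasing bug and the `pp_enabled` guard concrete:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Without PowerPlay, pp_handle aliases the device itself, so running
 * the destroy path would free fields of the wrong struct. */
struct pp_instance { int *hwmgr; };

static int destroy_calls;

static void toy_powerplay_destroy(struct pp_instance *pp)
{
	destroy_calls++;
	free(pp->hwmgr);
	free(pp);
}

/* The fix: only destroy the handle when it really is a pp_instance. */
static void toy_pp_late_fini(void *pp_handle, bool pp_enabled)
{
	if (pp_enabled)
		toy_powerplay_destroy(pp_handle);
	/* else: pp_handle is really the amdgpu_device; leave it alone */
}
```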

Fixes: 1c8638024846 (drm/amd/powerplay: refine powerplay interface.)
Signed-off-by: John Brooks 
Acked-by: Christian König 
Reviewed-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_powerplay.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_powerplay.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_powerplay.c
index 72c03c7..93ffb85 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_powerplay.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_powerplay.c
@@ -209,7 +209,8 @@ static void amdgpu_pp_late_fini(void *handle)
if (adev->pp_enabled && adev->pm.dpm_enabled)
amdgpu_pm_sysfs_fini(adev);
 
-   amd_powerplay_destroy(adev->powerplay.pp_handle);
+   if (adev->pp_enabled)
+   amd_powerplay_destroy(adev->powerplay.pp_handle);
 }
 
 static int amdgpu_pp_suspend(void *handle)
-- 
2.7.4



[PATCH v2 1/2] drm/ttm: Fix use-after-free in ttm_bo_clean_mm

2017-07-03 Thread John Brooks
We unref the man->move fence in ttm_bo_clean_mm() and then call
ttm_bo_force_list_clean() which waits on it, except the refcount is now
zero so a warning is generated (or worse):

[149492.279301] refcount_t: increment on 0; use-after-free.
[149492.279309] [ cut here ]
[149492.279315] WARNING: CPU: 3 PID: 18726 at lib/refcount.c:150 
refcount_inc+0x2b/0x30
[149492.279315] Modules linked in: vhost_net vhost tun x86_pkg_temp_thermal 
crc32_pclmul ghash_clmulni_intel efivarfs amdgpu(
-) i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops 
ttm drm
[149492.279326] CPU: 3 PID: 18726 Comm: rmmod Not tainted 
4.12.0-rc5-drm-next-4.13-ttmpatch+ #1
[149492.279326] Hardware name: Gigabyte Technology Co., Ltd. 
Z97X-UD3H-BK/Z97X-UD3H-BK-CF, BIOS F6 06/17/2014
[149492.279327] task: 8804ddfedcc0 task.stack: c90008d2
[149492.279329] RIP: 0010:refcount_inc+0x2b/0x30
[149492.279330] RSP: 0018:c90008d23c30 EFLAGS: 00010286
[149492.279331] RAX: 002b RBX: 0170 RCX: 

[149492.279331] RDX:  RSI: 88051ecccbe8 RDI: 
88051ecccbe8
[149492.279332] RBP: c90008d23c30 R08: 0001 R09: 
03ee
[149492.279333] R10: c90008d23bb0 R11: 03ee R12: 
88043aaac960
[149492.279333] R13: 8805005e28a8 R14: 0002 R15: 
88050115e178
[149492.279334] FS:  7fc540168700() GS:88051ecc() 
knlGS:
[149492.279335] CS:  0010 DS:  ES:  CR0: 80050033
[149492.279336] CR2: 7fc3e8654140 CR3: 00027ba77000 CR4: 
001426e0
[149492.279337] DR0:  DR1:  DR2: 

[149492.279337] DR3:  DR6: fffe0ff0 DR7: 
0400
[149492.279338] Call Trace:
[149492.279345]  ttm_bo_force_list_clean+0xb9/0x110 [ttm]
[149492.279348]  ttm_bo_clean_mm+0x7a/0xe0 [ttm]
[149492.279375]  amdgpu_ttm_fini+0xc9/0x1f0 [amdgpu]
[149492.279392]  amdgpu_bo_fini+0x12/0x40 [amdgpu]
[149492.279415]  gmc_v7_0_sw_fini+0x32/0x40 [amdgpu]
[149492.279430]  amdgpu_fini+0x2c9/0x490 [amdgpu]
[149492.279445]  amdgpu_device_fini+0x58/0x1b0 [amdgpu]
[149492.279461]  amdgpu_driver_unload_kms+0x4f/0xa0 [amdgpu]
[149492.279470]  drm_dev_unregister+0x3c/0xe0 [drm]
[149492.279485]  amdgpu_pci_remove+0x19/0x30 [amdgpu]
[149492.279487]  pci_device_remove+0x39/0xc0
[149492.279490]  device_release_driver_internal+0x155/0x210
[149492.279491]  driver_detach+0x38/0x70
[149492.279493]  bus_remove_driver+0x4c/0xa0
[149492.279494]  driver_unregister+0x2c/0x40
[149492.279496]  pci_unregister_driver+0x21/0x90
[149492.279520]  amdgpu_exit+0x15/0x406 [amdgpu]
[149492.279523]  SyS_delete_module+0x1a8/0x270
[149492.279525]  ? exit_to_usermode_loop+0x92/0xa0
[149492.279528]  entry_SYSCALL_64_fastpath+0x13/0x94
[149492.279529] RIP: 0033:0x7fc53fcb68e7
[149492.279529] RSP: 002b:7ffcfbfaabb8 EFLAGS: 0206 ORIG_RAX: 
00b0
[149492.279531] RAX: ffda RBX: 563117adb200 RCX: 
7fc53fcb68e7
[149492.279531] RDX: 000a RSI: 0800 RDI: 
563117adb268
[149492.279532] RBP: 0003 R08:  R09: 
1999
[149492.279533] R10: 0883 R11: 0206 R12: 
7ffcfbfa9ba0
[149492.279533] R13:  R14:  R15: 
563117adb200
[149492.279534] Code: 55 48 89 e5 e8 77 fe ff ff 84 c0 74 02 5d c3 80 3d 40 f2 
a4 00 00 75 f5 48 c7 c7 20 3c ca 81 c6 05 30 f2 a4 00 01 e8 91 f0 d7 ff <0f> ff 
5d c3 90 55 48 89 fe bf 01 00 00 00 48 89 e5 e8 9f fe ff
[149492.279557] ---[ end trace 2d4e0ffcb66a1016 ]---

Unref the fence *after* waiting for it.

v2: Set man->move to NULL after dropping the last ref (Christian König)
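The ordering constraint can be demonstrated with a toy refcounted fence. `toy_fence_*` and `toy_clean_mm` are illustrative stand-ins, not the TTM API; the point is only that the waiter takes its own reference, so the manager must hold its reference until the wait completes:

```c
#include <assert.h>
#include <stdlib.h>

struct toy_fence { int refcount; };

static struct toy_fence *toy_fence_get(struct toy_fence *f)
{
	assert(f->refcount > 0);	/* "increment on 0; use-after-free" */
	f->refcount++;
	return f;
}

static void toy_fence_put(struct toy_fence *f)
{
	if (--f->refcount == 0)
		free(f);
}

static void toy_wait(struct toy_fence *f)
{
	struct toy_fence *ref = toy_fence_get(f);
	/* ... block until the fence signals ... */
	toy_fence_put(ref);
}

/* Fixed ordering from the patch: wait first, drop the manager's
 * reference last, then clear the pointer so nobody reuses it. */
static struct toy_fence *toy_clean_mm(struct toy_fence *move)
{
	if (move) {
		toy_wait(move);		/* safe: refcount still >= 1 */
		toy_fence_put(move);
	}
	return NULL;			/* man->move = NULL */
}
```

With the original ordering (put before wait), `toy_wait()` would trip the refcount-zero assertion, which is exactly the warning in the trace above.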

Fixes: aff98ba1fdb8 (drm/ttm: wait for eviction in ttm_bo_force_list_clean)
Signed-off-by: John Brooks 
Reviewed-by: Christian König 
Reviewed-by: Alex Deucher 
---
 drivers/gpu/drm/ttm/ttm_bo.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index a6d7fcb..22b5702 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -1353,7 +1353,6 @@ int ttm_bo_clean_mm(struct ttm_bo_device *bdev, unsigned 
mem_type)
   mem_type);
return ret;
}
-   dma_fence_put(man->move);
 
man->use_type = false;
man->has_type = false;
@@ -1369,6 +1368,9 @@ int ttm_bo_clean_mm(struct ttm_bo_device *bdev, unsigned 
mem_type)
ret = (*man->func->takedown)(man);
}
 
+   dma_fence_put(man->move);
+   man->move = NULL;
+
return ret;
 }
 EXPORT_SYMBOL(ttm_bo_clean_mm);
-- 
2.7.4



RE: [PATCH] drm/amdgpu: trace VM flags as 64bits

2017-07-03 Thread Deucher, Alexander
> -Original Message-
> From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf
> Of Christian König
> Sent: Monday, July 03, 2017 9:25 AM
> To: amd-gfx@lists.freedesktop.org
> Subject: [PATCH] drm/amdgpu: trace VM flags as 64bits
> 
> From: Christian König 
> 
> Otherwise the upper bits are lost.
> 
> Signed-off-by: Christian König 

Reviewed-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h | 18 +-
>  1 file changed, 9 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
> index 8601904..509f7a6 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
> @@ -224,7 +224,7 @@ TRACE_EVENT(amdgpu_vm_bo_map,
>__field(long, start)
>__field(long, last)
>__field(u64, offset)
> -  __field(u32, flags)
> +  __field(u64, flags)
>),
> 
>   TP_fast_assign(
> @@ -234,7 +234,7 @@ TRACE_EVENT(amdgpu_vm_bo_map,
>  __entry->offset = mapping->offset;
>  __entry->flags = mapping->flags;
>  ),
> - TP_printk("bo=%p, start=%lx, last=%lx, offset=%010llx,
> flags=%08x",
> + TP_printk("bo=%p, start=%lx, last=%lx, offset=%010llx, flags=%llx",
> __entry->bo, __entry->start, __entry->last,
> __entry->offset, __entry->flags)
>  );
> @@ -248,7 +248,7 @@ TRACE_EVENT(amdgpu_vm_bo_unmap,
>__field(long, start)
>__field(long, last)
>__field(u64, offset)
> -  __field(u32, flags)
> +  __field(u64, flags)
>),
> 
>   TP_fast_assign(
> @@ -258,7 +258,7 @@ TRACE_EVENT(amdgpu_vm_bo_unmap,
>  __entry->offset = mapping->offset;
>  __entry->flags = mapping->flags;
>  ),
> - TP_printk("bo=%p, start=%lx, last=%lx, offset=%010llx,
> flags=%08x",
> + TP_printk("bo=%p, start=%lx, last=%lx, offset=%010llx, flags=%llx",
> __entry->bo, __entry->start, __entry->last,
> __entry->offset, __entry->flags)
>  );
> @@ -269,7 +269,7 @@ DECLARE_EVENT_CLASS(amdgpu_vm_mapping,
>   TP_STRUCT__entry(
>__field(u64, soffset)
>__field(u64, eoffset)
> -  __field(u32, flags)
> +  __field(u64, flags)
>),
> 
>   TP_fast_assign(
> @@ -277,7 +277,7 @@ DECLARE_EVENT_CLASS(amdgpu_vm_mapping,
>  __entry->eoffset = mapping->last + 1;
>  __entry->flags = mapping->flags;
>  ),
> - TP_printk("soffs=%010llx, eoffs=%010llx, flags=%08x",
> + TP_printk("soffs=%010llx, eoffs=%010llx, flags=%llx",
> __entry->soffset, __entry->eoffset, __entry->flags)
>  );
> 
> @@ -293,14 +293,14 @@ DEFINE_EVENT(amdgpu_vm_mapping,
> amdgpu_vm_bo_mapping,
> 
>  TRACE_EVENT(amdgpu_vm_set_ptes,
>   TP_PROTO(uint64_t pe, uint64_t addr, unsigned count,
> -  uint32_t incr, uint32_t flags),
> +  uint32_t incr, uint64_t flags),
>   TP_ARGS(pe, addr, count, incr, flags),
>   TP_STRUCT__entry(
>__field(u64, pe)
>__field(u64, addr)
>__field(u32, count)
>__field(u32, incr)
> -  __field(u32, flags)
> +  __field(u64, flags)
>),
> 
>   TP_fast_assign(
> @@ -310,7 +310,7 @@ TRACE_EVENT(amdgpu_vm_set_ptes,
>  __entry->incr = incr;
>  __entry->flags = flags;
>  ),
> - TP_printk("pe=%010Lx, addr=%010Lx, incr=%u, flags=%08x,
> count=%u",
> + TP_printk("pe=%010Lx, addr=%010Lx, incr=%u, flags=%llx,
> count=%u",
> __entry->pe, __entry->addr, __entry->incr,
> __entry->flags, __entry->count)
>  );
> --
> 2.7.4
> 


RE: [PATCH] drm/amdgpu: remove stale TODO comment

2017-07-03 Thread Deucher, Alexander
> -Original Message-
> From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf
> Of Christian König
> Sent: Monday, July 03, 2017 9:22 AM
> To: amd-gfx@lists.freedesktop.org
> Subject: [PATCH] drm/amdgpu: remove stale TODO comment
> 
> From: Christian König 
> 
> That is already fixed.
> 
> Signed-off-by: Christian König 

I saw that the other day and meant to send a patch as well.
Reviewed-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index c34cf2c..a85e753 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -951,7 +951,6 @@ int amdgpu_bo_fault_reserve_notify(struct
> ttm_buffer_object *bo)
> 
>   size = bo->mem.num_pages << PAGE_SHIFT;
>   offset = bo->mem.start << PAGE_SHIFT;
> - /* TODO: figure out how to map scattered VRAM to the CPU */
>   if ((offset + size) <= adev->mc.visible_vram_size)
>   return 0;
> 
> --
> 2.7.4
> 


RE: [PATCH 3/4] drm/amd/powerplay: move VI common AVFS code to smu7_smumgr.c

2017-07-03 Thread Deucher, Alexander
> -Original Message-
> From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf
> Of Rex Zhu
> Sent: Monday, July 03, 2017 6:14 AM
> To: amd-gfx@lists.freedesktop.org
> Cc: Zhu, Rex
> Subject: [PATCH 3/4] drm/amd/powerplay: move VI common AVFS code to
> smu7_smumgr.c
> 
> Change-Id: I2bee3e700281a57ad77132794187ef45d2d79dcd
> Signed-off-by: Rex Zhu 
> ---
>  drivers/gpu/drm/amd/powerplay/inc/smumgr.h |  3 +
>  drivers/gpu/drm/amd/powerplay/smumgr/fiji_smc.c|  6 +-
>  drivers/gpu/drm/amd/powerplay/smumgr/fiji_smumgr.c | 73 ++-
> ---
>  drivers/gpu/drm/amd/powerplay/smumgr/fiji_smumgr.h | 11 
>  .../gpu/drm/amd/powerplay/smumgr/polaris10_smc.c   |  4 +-
>  .../drm/amd/powerplay/smumgr/polaris10_smumgr.c| 29 -
>  .../drm/amd/powerplay/smumgr/polaris10_smumgr.h| 12 +---
>  drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c |  6 +-
>  drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.h |  8 ++-
>  drivers/gpu/drm/amd/powerplay/smumgr/smumgr.c  |  8 +++
>  10 files changed, 74 insertions(+), 86 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/powerplay/inc/smumgr.h
> b/drivers/gpu/drm/amd/powerplay/inc/smumgr.h
> index 976e942..5d61cc9 100644
> --- a/drivers/gpu/drm/amd/powerplay/inc/smumgr.h
> +++ b/drivers/gpu/drm/amd/powerplay/inc/smumgr.h
> @@ -131,6 +131,7 @@ struct pp_smumgr_func {
>   bool (*is_dpm_running)(struct pp_hwmgr *hwmgr);
>   int (*populate_requested_graphic_levels)(struct pp_hwmgr
> *hwmgr,
>   struct amd_pp_profile *request);
> + bool (*is_hw_avfs_present)(struct pp_smumgr *smumgr);
>  };
> 
>  struct pp_smumgr {
> @@ -202,6 +203,8 @@ extern uint32_t smum_get_offsetof(struct
> pp_smumgr *smumgr,
>  extern int smum_populate_requested_graphic_levels(struct pp_hwmgr
> *hwmgr,
>   struct amd_pp_profile *request);
> 
> +extern bool smum_is_hw_avfs_present(struct pp_smumgr *smumgr);
> +
>  #define SMUM_FIELD_SHIFT(reg, field) reg##__##field##__SHIFT
> 
>  #define SMUM_FIELD_MASK(reg, field) reg##__##field##_MASK
> diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smc.c
> b/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smc.c
> index ca24e15..0750530 100644
> --- a/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smc.c
> +++ b/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smc.c
> @@ -2134,16 +2134,16 @@ int fiji_thermal_avfs_enable(struct pp_hwmgr
> *hwmgr)
>  {
>   int ret;
>   struct pp_smumgr *smumgr = (struct pp_smumgr *)(hwmgr-
> >smumgr);
> - struct fiji_smumgr *smu_data = (struct fiji_smumgr *)(smumgr-
> >backend);
> + struct smu7_smumgr *smu_data = (struct smu7_smumgr
> *)(smumgr->backend);
> 
> - if (smu_data->avfs.AvfsBtcStatus != AVFS_BTC_ENABLEAVFS)
> + if (smu_data->avfs.avfs_btc_param != AVFS_BTC_ENABLEAVFS)
>   return 0;
> 
>   ret = smum_send_msg_to_smc(smumgr,
> PPSMC_MSG_EnableAvfs);
> 
>   if (!ret)
>   /* If this param is not changed, this function could fire
> unnecessarily */
> - smu_data->avfs.AvfsBtcStatus =
> AVFS_BTC_COMPLETED_PREVIOUSLY;
> + smu_data->avfs.avfs_btc_param =
> AVFS_BTC_COMPLETED_PREVIOUSLY;
> 
>   return ret;
>  }
> diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smumgr.c
> b/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smumgr.c
> index 719e885..492f682 100644
> --- a/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smumgr.c
> +++ b/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smumgr.c
> @@ -161,43 +161,46 @@ static int
> fiji_start_smu_in_non_protection_mode(struct pp_smumgr *smumgr)
> 
>  static int fiji_setup_pwr_virus(struct pp_smumgr *smumgr)
>  {
> - int i, result = -1;
> + int i;
> + int result = -EINVAL;
>   uint32_t reg, data;
> - const PWR_Command_Table *virus = PwrVirusTable;
> - struct fiji_smumgr *priv = (struct fiji_smumgr *)(smumgr->backend);
> 
> - priv->avfs.AvfsBtcStatus = AVFS_LOAD_VIRUS;
> - for (i = 0; (i < PWR_VIRUS_TABLE_SIZE); i++) {
> - switch (virus->command) {
> + const PWR_Command_Table *pvirus = PwrVirusTable;
> + struct smu7_smumgr *smu_data = (struct smu7_smumgr
> *)(smumgr->backend);
> +
> + for (i = 0; i < PWR_VIRUS_TABLE_SIZE; i++) {
> + switch (pvirus->command) {
>   case PwrCmdWrite:
> - reg  = virus->reg;
> - data = virus->data;
> + reg  = pvirus->reg;
> + data = pvirus->data;
>   cgs_write_register(smumgr->device, reg, data);
>   break;
> +
>   case PwrCmdEnd:
> - priv->avfs.AvfsBtcStatus =
> AVFS_BTC_VIRUS_LOADED;
>   result = 0;
>   break;
> +
>   default:
> - pr_err("Table Exit with Invalid Command!");
> - priv->avfs.AvfsBtcStatus = AVFS_BTC_VIRUS_FAIL;
> - result = -1;
> + 

RE: [PATCH 1/4] drm/amd/powerplay: fix avfs state update error on polaris.

2017-07-03 Thread Deucher, Alexander
> -Original Message-
> From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf
> Of Rex Zhu
> Sent: Monday, July 03, 2017 6:14 AM
> To: amd-gfx@lists.freedesktop.org
> Cc: Zhu, Rex
> Subject: [PATCH 1/4] drm/amd/powerplay: fix avfs state update error on
> polaris.
> 
> Otherwise we always print "avfs broken" in dmesg.
> 
> Change-Id: If4b2d9607acf5a600b496204a5cc79a2d867260c
> Signed-off-by: Rex Zhu 

Reviewed-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/powerplay/smumgr/polaris10_smumgr.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/polaris10_smumgr.c
> b/drivers/gpu/drm/amd/powerplay/smumgr/polaris10_smumgr.c
> index 9616ced..7e03470 100644
> --- a/drivers/gpu/drm/amd/powerplay/smumgr/polaris10_smumgr.c
> +++ b/drivers/gpu/drm/amd/powerplay/smumgr/polaris10_smumgr.c
> @@ -200,15 +200,16 @@ static int
> polaris10_setup_graphics_level_structure(struct pp_smumgr *smumgr)
>   PP_ASSERT_WITH_CODE(0 ==
> polaris10_perform_btc(smumgr),
>   "[AVFS][Polaris10_AVFSEventMgr]
> Failure at SmuPolaris10_PerformBTC. AVFS Disabled",
>return -1);
> -
> + smu_data->avfs.avfs_btc_status = AVFS_BTC_ENABLEAVFS;
>   break;
> 
>   case AVFS_BTC_DISABLED:
> + case AVFS_BTC_ENABLEAVFS:
>   case AVFS_BTC_NOTSUPPORTED:
>   break;
> 
>   default:
> - pr_info("[AVFS] Something is broken. See log!");
> + pr_err("AVFS failed status is %x!\n", smu_data-
> >avfs.avfs_btc_status);
>   break;
>   }
> 
> --
> 1.9.1
> 


RE: [PATCH 4/4] drm/amd/powerplay: add avfs check for old asics on Vi.

2017-07-03 Thread Deucher, Alexander


> -Original Message-
> From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf
> Of Rex Zhu
> Sent: Monday, July 03, 2017 6:14 AM
> To: amd-gfx@lists.freedesktop.org
> Cc: Zhu, Rex
> Subject: [PATCH 4/4] drm/amd/powerplay: add avfs check for old asics on Vi.
> 
> Change-Id: I1737ba27ae2a8f5c579ccb541ced9cb979ffd1ff
> Signed-off-by: Rex Zhu 

Reviewed-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c | 9 +
>  1 file changed, 9 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
> b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
> index e7ecbd1..8fe62aa 100644
> --- a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
> +++ b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
> @@ -4660,6 +4660,15 @@ static int smu7_set_power_profile_state(struct
> pp_hwmgr *hwmgr,
> 
>  static int smu7_avfs_control(struct pp_hwmgr *hwmgr, bool enable)
>  {
> + struct pp_smumgr *smumgr = (struct pp_smumgr *)(hwmgr-
> >smumgr);
> + struct smu7_smumgr *smu_data = (struct smu7_smumgr
> *)(smumgr->backend);
> +
> + if (smu_data == NULL)
> + return -EINVAL;
> +
> + if (smu_data->avfs.avfs_btc_status == AVFS_BTC_NOTSUPPORTED)
> + return 0;
> +
>   if (enable) {
>   if (!PHM_READ_VFPF_INDIRECT_FIELD(hwmgr->device,
>   CGS_IND_REG__SMC, FEATURE_STATUS,
> AVS_ON))
> --
> 1.9.1
> 


RE: [PATCH] drm/amdgpu: fix vulkan test performance drop and hang on VI

2017-07-03 Thread Deucher, Alexander
> -Original Message-
> From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf
> Of Rex Zhu
> Sent: Monday, July 03, 2017 6:13 AM
> To: amd-gfx@lists.freedesktop.org
> Cc: Zhu, Rex
> Subject: [PATCH] drm/amdgpu: fix vulkan test performance drop and hang
> on VI
> 
> caused by not programming dynamic_cu_mask_addr in the KIQ MQD.
> 
> v2: create struct vi_mqd_allocation in FB which will contain
> 1. PM4 MQD structure.
> 2. Write Pointer Poll Memory.
> 3. Read Pointer Report Memory
> 4. Dynamic CU Mask.
> 5. Dynamic RB Mask.
> 
> Change-Id: I22c840f1bf8d365f7df33a27d6b11e1aea8f2958
> Signed-off-by: Rex Zhu 

Reviewed-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c|  27 ++--
>  drivers/gpu/drm/amd/include/vi_structs.h | 268
> +++
>  2 files changed, 285 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
> b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
> index 1a75ab1..452cc5b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
> @@ -40,7 +40,6 @@
> 
>  #include "bif/bif_5_0_d.h"
>  #include "bif/bif_5_0_sh_mask.h"
> -
>  #include "gca/gfx_8_0_d.h"
>  #include "gca/gfx_8_0_enum.h"
>  #include "gca/gfx_8_0_sh_mask.h"
> @@ -2100,7 +2099,7 @@ static int gfx_v8_0_sw_init(void *handle)
>   return r;
> 
>   /* create MQD for all compute queues as well as KIQ for SRIOV case
> */
> - r = amdgpu_gfx_compute_mqd_sw_init(adev, sizeof(struct
> vi_mqd));
> + r = amdgpu_gfx_compute_mqd_sw_init(adev, sizeof(struct
> vi_mqd_allocation));
>   if (r)
>   return r;
> 
> @@ -4715,9 +4714,6 @@ static int gfx_v8_0_mqd_init(struct amdgpu_ring
> *ring)
>   uint64_t hqd_gpu_addr, wb_gpu_addr, eop_base_addr;
>   uint32_t tmp;
> 
> - /* init the mqd struct */
> - memset(mqd, 0, sizeof(struct vi_mqd));
> -
>   mqd->header = 0xC0310800;
>   mqd->compute_pipelinestat_enable = 0x0001;
>   mqd->compute_static_thread_mgmt_se0 = 0x;
> @@ -4725,7 +4721,12 @@ static int gfx_v8_0_mqd_init(struct amdgpu_ring
> *ring)
>   mqd->compute_static_thread_mgmt_se2 = 0x;
>   mqd->compute_static_thread_mgmt_se3 = 0x;
>   mqd->compute_misc_reserved = 0x0003;
> -
> +	if (!(adev->flags & AMD_IS_APU)) {
> +		mqd->dynamic_cu_mask_addr_lo = lower_32_bits(ring->mqd_gpu_addr
> +			+ offsetof(struct vi_mqd_allocation, dyamic_cu_mask));
> +		mqd->dynamic_cu_mask_addr_hi = upper_32_bits(ring->mqd_gpu_addr
> +			+ offsetof(struct vi_mqd_allocation, dyamic_cu_mask));
> +	}
> 	eop_base_addr = ring->eop_gpu_addr >> 8;
> 	mqd->cp_hqd_eop_base_addr_lo = eop_base_addr;
> 	mqd->cp_hqd_eop_base_addr_hi = upper_32_bits(eop_base_addr);
> @@ -4900,7 +4901,7 @@ static int gfx_v8_0_kiq_init_queue(struct amdgpu_ring *ring)
> 	if (adev->gfx.in_reset) { /* for GPU_RESET case */
> 		/* reset MQD to a clean status */
> 		if (adev->gfx.mec.mqd_backup[mqd_idx])
> -			memcpy(mqd, adev->gfx.mec.mqd_backup[mqd_idx], sizeof(*mqd));
> +			memcpy(mqd, adev->gfx.mec.mqd_backup[mqd_idx], sizeof(struct vi_mqd_allocation));
> 
> 		/* reset ring buffer */
> 		ring->wptr = 0;
> @@ -4916,6 +4917,9 @@ static int gfx_v8_0_kiq_init_queue(struct amdgpu_ring *ring)
> 		vi_srbm_select(adev, 0, 0, 0, 0);
> 		mutex_unlock(&adev->srbm_mutex);
> 	} else {
> +		memset((void *)mqd, 0, sizeof(struct vi_mqd_allocation));
> +		((struct vi_mqd_allocation *)mqd)->dyamic_cu_mask = 0xFFFFFFFF;
> +		((struct vi_mqd_allocation *)mqd)->dyamic_rb_mask = 0xFFFFFFFF;
> 		mutex_lock(&adev->srbm_mutex);
> 		vi_srbm_select(adev, ring->me, ring->pipe, ring->queue, 0);
> 		gfx_v8_0_mqd_init(ring);
> @@ -4929,7 +4933,7 @@ static int gfx_v8_0_kiq_init_queue(struct amdgpu_ring *ring)
> 		mutex_unlock(&adev->srbm_mutex);
> 
> 		if (adev->gfx.mec.mqd_backup[mqd_idx])
> -			memcpy(adev->gfx.mec.mqd_backup[mqd_idx], mqd, sizeof(*mqd));
> +			memcpy(adev->gfx.mec.mqd_backup[mqd_idx], mqd, sizeof(struct vi_mqd_allocation));
> 	}
> 
> 	return r;
> @@ -4947,6 +4951,9 @@ static int gfx_v8_0_kcq_init_queue(struct amdgpu_ring *ring)
> 	int mqd_idx = ring - &adev->gfx.compute_ring[0];
> 
> 	if (!adev->gfx.in_reset && !adev->gfx.in_suspend) {
> +		memset((void *)mqd, 0, sizeof(struct vi_mqd_allocation));
> +		((struct vi_mqd_allocation *)mqd)->dyamic_cu_mask = 0xFFFFFFFF;
> +		((struct vi_mqd_allocation *)mqd)->dyamic_rb_mask = 0xFFFFFFFF;
> 		mutex_lock(&adev->srbm_mutex);
> 		vi_srbm_select(adev, 

Re: [PATCH] drm/amdgpu: trace VM flags as 64bits

2017-07-03 Thread axie

Reviewed-by: Alex Xie 


On 2017-07-03 09:25 AM, Christian König wrote:

From: Christian König 

Otherwise the upper bits are lost.

Signed-off-by: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h | 18 +-
  1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
index 8601904..509f7a6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
@@ -224,7 +224,7 @@ TRACE_EVENT(amdgpu_vm_bo_map,
 __field(long, start)
 __field(long, last)
 __field(u64, offset)
-__field(u32, flags)
+__field(u64, flags)
 ),
  
  	TP_fast_assign(

@@ -234,7 +234,7 @@ TRACE_EVENT(amdgpu_vm_bo_map,
   __entry->offset = mapping->offset;
   __entry->flags = mapping->flags;
   ),
-   TP_printk("bo=%p, start=%lx, last=%lx, offset=%010llx, flags=%08x",
+   TP_printk("bo=%p, start=%lx, last=%lx, offset=%010llx, flags=%llx",
  __entry->bo, __entry->start, __entry->last,
  __entry->offset, __entry->flags)
  );
@@ -248,7 +248,7 @@ TRACE_EVENT(amdgpu_vm_bo_unmap,
 __field(long, start)
 __field(long, last)
 __field(u64, offset)
-__field(u32, flags)
+__field(u64, flags)
 ),
  
  	TP_fast_assign(

@@ -258,7 +258,7 @@ TRACE_EVENT(amdgpu_vm_bo_unmap,
   __entry->offset = mapping->offset;
   __entry->flags = mapping->flags;
   ),
-   TP_printk("bo=%p, start=%lx, last=%lx, offset=%010llx, flags=%08x",
+   TP_printk("bo=%p, start=%lx, last=%lx, offset=%010llx, flags=%llx",
  __entry->bo, __entry->start, __entry->last,
  __entry->offset, __entry->flags)
  );
@@ -269,7 +269,7 @@ DECLARE_EVENT_CLASS(amdgpu_vm_mapping,
TP_STRUCT__entry(
 __field(u64, soffset)
 __field(u64, eoffset)
-__field(u32, flags)
+__field(u64, flags)
 ),
  
  	TP_fast_assign(

@@ -277,7 +277,7 @@ DECLARE_EVENT_CLASS(amdgpu_vm_mapping,
   __entry->eoffset = mapping->last + 1;
   __entry->flags = mapping->flags;
   ),
-   TP_printk("soffs=%010llx, eoffs=%010llx, flags=%08x",
+   TP_printk("soffs=%010llx, eoffs=%010llx, flags=%llx",
  __entry->soffset, __entry->eoffset, __entry->flags)
  );
  
@@ -293,14 +293,14 @@ DEFINE_EVENT(amdgpu_vm_mapping, amdgpu_vm_bo_mapping,
  
  TRACE_EVENT(amdgpu_vm_set_ptes,

TP_PROTO(uint64_t pe, uint64_t addr, unsigned count,
-uint32_t incr, uint32_t flags),
+uint32_t incr, uint64_t flags),
TP_ARGS(pe, addr, count, incr, flags),
TP_STRUCT__entry(
 __field(u64, pe)
 __field(u64, addr)
 __field(u32, count)
 __field(u32, incr)
-__field(u32, flags)
+__field(u64, flags)
 ),
  
  	TP_fast_assign(

@@ -310,7 +310,7 @@ TRACE_EVENT(amdgpu_vm_set_ptes,
   __entry->incr = incr;
   __entry->flags = flags;
   ),
-   TP_printk("pe=%010Lx, addr=%010Lx, incr=%u, flags=%08x, count=%u",
+   TP_printk("pe=%010Lx, addr=%010Lx, incr=%u, flags=%llx, count=%u",
  __entry->pe, __entry->addr, __entry->incr,
  __entry->flags, __entry->count)
  );


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH] drm/amdgpu: trace VM flags as 64bits

2017-07-03 Thread Christian König
From: Christian König 

Otherwise the upper bits are lost.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
index 8601904..509f7a6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
@@ -224,7 +224,7 @@ TRACE_EVENT(amdgpu_vm_bo_map,
 __field(long, start)
 __field(long, last)
 __field(u64, offset)
-__field(u32, flags)
+__field(u64, flags)
 ),
 
TP_fast_assign(
@@ -234,7 +234,7 @@ TRACE_EVENT(amdgpu_vm_bo_map,
   __entry->offset = mapping->offset;
   __entry->flags = mapping->flags;
   ),
-   TP_printk("bo=%p, start=%lx, last=%lx, offset=%010llx, flags=%08x",
+   TP_printk("bo=%p, start=%lx, last=%lx, offset=%010llx, flags=%llx",
  __entry->bo, __entry->start, __entry->last,
  __entry->offset, __entry->flags)
 );
@@ -248,7 +248,7 @@ TRACE_EVENT(amdgpu_vm_bo_unmap,
 __field(long, start)
 __field(long, last)
 __field(u64, offset)
-__field(u32, flags)
+__field(u64, flags)
 ),
 
TP_fast_assign(
@@ -258,7 +258,7 @@ TRACE_EVENT(amdgpu_vm_bo_unmap,
   __entry->offset = mapping->offset;
   __entry->flags = mapping->flags;
   ),
-   TP_printk("bo=%p, start=%lx, last=%lx, offset=%010llx, flags=%08x",
+   TP_printk("bo=%p, start=%lx, last=%lx, offset=%010llx, flags=%llx",
  __entry->bo, __entry->start, __entry->last,
  __entry->offset, __entry->flags)
 );
@@ -269,7 +269,7 @@ DECLARE_EVENT_CLASS(amdgpu_vm_mapping,
TP_STRUCT__entry(
 __field(u64, soffset)
 __field(u64, eoffset)
-__field(u32, flags)
+__field(u64, flags)
 ),
 
TP_fast_assign(
@@ -277,7 +277,7 @@ DECLARE_EVENT_CLASS(amdgpu_vm_mapping,
   __entry->eoffset = mapping->last + 1;
   __entry->flags = mapping->flags;
   ),
-   TP_printk("soffs=%010llx, eoffs=%010llx, flags=%08x",
+   TP_printk("soffs=%010llx, eoffs=%010llx, flags=%llx",
  __entry->soffset, __entry->eoffset, __entry->flags)
 );
 
@@ -293,14 +293,14 @@ DEFINE_EVENT(amdgpu_vm_mapping, amdgpu_vm_bo_mapping,
 
 TRACE_EVENT(amdgpu_vm_set_ptes,
TP_PROTO(uint64_t pe, uint64_t addr, unsigned count,
-uint32_t incr, uint32_t flags),
+uint32_t incr, uint64_t flags),
TP_ARGS(pe, addr, count, incr, flags),
TP_STRUCT__entry(
 __field(u64, pe)
 __field(u64, addr)
 __field(u32, count)
 __field(u32, incr)
-__field(u32, flags)
+__field(u64, flags)
 ),
 
TP_fast_assign(
@@ -310,7 +310,7 @@ TRACE_EVENT(amdgpu_vm_set_ptes,
   __entry->incr = incr;
   __entry->flags = flags;
   ),
-   TP_printk("pe=%010Lx, addr=%010Lx, incr=%u, flags=%08x, count=%u",
+   TP_printk("pe=%010Lx, addr=%010Lx, incr=%u, flags=%llx, count=%u",
  __entry->pe, __entry->addr, __entry->incr,
  __entry->flags, __entry->count)
 );
-- 
2.7.4



[PATCH] drm/amdgpu: remove stale TODO comment

2017-07-03 Thread Christian König
From: Christian König 

That is already fixed.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index c34cf2c..a85e753 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -951,7 +951,6 @@ int amdgpu_bo_fault_reserve_notify(struct ttm_buffer_object *bo)
 
size = bo->mem.num_pages << PAGE_SHIFT;
offset = bo->mem.start << PAGE_SHIFT;
-   /* TODO: figure out how to map scattered VRAM to the CPU */
if ((offset + size) <= adev->mc.visible_vram_size)
return 0;
 
-- 
2.7.4



Re: Deprecation of AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED

2017-07-03 Thread Marek Olšák
On Mon, Jul 3, 2017 at 12:08 PM, Michel Dänzer  wrote:
> On 30/06/17 08:43 PM, Marek Olšák wrote:
>>
>> I don't know what is being talked about here anymore, but I wouldn't
>> like to use CPU_ACCESS_REQUIRED or CPU_ACCESS_REALLY_REQUIRED in
>> userspace. The reason is that userspace doesn't and can't know whether
>> CPU access will be required, and the frequency at which it will be
>> required. 3 heaps {no CPU access, no flag, CPU access required} are
>> too many. Userspace mostly doesn't use the "no flag" heap for VRAM. It
>> uses "CPU access required" for almost everything except tiled
>> textures, which use "no CPU access".
>
> FWIW, the difference between setting CPU_ACCESS_REQUIRED and not setting
> it for a BO created in VRAM will be: If it's set, the BO is initially
> created in CPU visible VRAM, otherwise it's most likely created in CPU
> invisible VRAM.
>
> If userspace knows that a BO will likely be accessed by the CPU first,
> setting the flag could save a move from CPU invisible to CPU visible
> VRAM when the CPU access happens. Conversely, if a BO will likely never
> be accessed by the CPU, not setting the flag may reduce pressure on CPU
> visible VRAM.
>
> Not sure radeonsi can make this distinction though.

It can't.

Either all mappable BOs set CPU_ACCESS_REQUIRED, or all mappable BOs
don't set it. Either way, there is only one combination of flags for
mappable BOs in VRAM, and therefore only one kind of behavior the
kernel can follow.

>
>
>> I've been trying to trim down the number of heaps. So far, I have:
>> - VRAM_NO_CPU_ACCESS (implies WC)
>> - VRAM (implies WC)
>> - VRAM_GTT (combined, implies WC)
>
> Is this useful? It means:
>
> * The BO may be created in VRAM, or if there's no space, in GTT.
> * Once the BO is in GTT for any reason, it will never go back to VRAM.
>
> Such BOs will tend to end up in GTT after some time, at the latest after
> suspend/resume.
>
> I think it would be better for radeonsi to choose either VRAM or GTT as
> the preferred domain, and let the kernel handle it.

Currently, radeonsi on amdgpu doesn't use VRAM_GTT with the current kernel.

I'm aware of the limited usefulness.

Marek


Re: [PATCH] drm/amdgpu: Make KIQ read/write register routine be atomic

2017-07-03 Thread Liu, Monk
Hi Shaoyun


It looks like you want to make KIQ register access atomic and uninterruptible.

I think most users of KIQ register access are not in atomic context, so your patch 
only benefits the places that use KIQ from IRQ context.

why not implement another KIQ reg access function ? e.g. :

amdgpu_virt_kiq_rreg_atomic(...);
amdgpu_virt_kiq_wreg_atomic(...);


That way you can satisfy your requirement without introducing unknown issues for 
the other places that call the original virt_kiq_r/wreg() functions.


Busy polling also has a chance to hang a CPU under SR-IOV (imagine 16 VFs, with many 
VFs doing FLR one by one: your busy polling may leave a guest kernel CPU stuck in 
atomic context for more than 5 seconds).


BR Monk


From: amd-gfx  on behalf of Liu, Shaoyun 

Sent: Friday, June 30, 2017 10:55:39 PM
To: Christian König; Michel Dänzer
Cc: amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH] drm/amdgpu: Make KIQ read/write register routine be atomic

Hi Christian,
The new code actually will not use the fence function; it just needs memory 
that exposes both CPU and GPU addresses. Do you really want to add wrapper 
functions that just expose the CPU and GPU addresses in this case?

Regards
Shaoyun.liu


-Original Message-
From: Christian König [mailto:deathsim...@vodafone.de]
Sent: Friday, June 30, 2017 3:57 AM
To: Michel Dänzer; Liu, Shaoyun
Cc: amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: Make KIQ read/write register routine be atomic

On 30.06.2017 at 03:21, Michel Dänzer wrote:
> On 30/06/17 06:08 AM, Shaoyun Liu wrote:
>> 1. Use spin lock instead of mutex in KIQ
>> 2. Directly write to KIQ fence address instead of using fence_emit()
>> 3. Disable the interrupt for KIQ read/write and use CPU polling
> This list indicates that this patch should be split up in at least
> three patches. :)
Yeah, apart from that it is not a good idea to mess with the fence internals 
directly in the KIQ code; please add a helper in the fence code for this.

Regards,
Christian.


Re: [PATCH RFC 2/2] drm/amdgpu: Set/clear CPU_ACCESS flag on page fault and move to VRAM

2017-07-03 Thread Christian König

On 03.07.2017 at 11:49, Michel Dänzer wrote:

Instead of messing with all this I suggest that we just add a jiffies
based timeout to the BO when we can clear the flag. For kernel BOs this
timeout is just infinity.

Then we check in amdgpu_cs_bo_validate() before generating the
placements if we could clear the flag and do so based on the timeout.

The idea for this patch was to save the memory and CPU cycles needed for
that approach.
But when we clear the flag at the end of the move, we have already moved 
the BO to visible VRAM again.

Only after the next swapout/swapin cycle do we see an effect of that change.

Is that the intended approach?

Regards,
Christian.


[PATCH 3/4] drm/amd/powerplay: move VI common AVFS code to smu7_smumgr.c

2017-07-03 Thread Rex Zhu
Change-Id: I2bee3e700281a57ad77132794187ef45d2d79dcd
Signed-off-by: Rex Zhu 
---
 drivers/gpu/drm/amd/powerplay/inc/smumgr.h |  3 +
 drivers/gpu/drm/amd/powerplay/smumgr/fiji_smc.c|  6 +-
 drivers/gpu/drm/amd/powerplay/smumgr/fiji_smumgr.c | 73 ++
 drivers/gpu/drm/amd/powerplay/smumgr/fiji_smumgr.h | 11 
 .../gpu/drm/amd/powerplay/smumgr/polaris10_smc.c   |  4 +-
 .../drm/amd/powerplay/smumgr/polaris10_smumgr.c| 29 -
 .../drm/amd/powerplay/smumgr/polaris10_smumgr.h| 12 +---
 drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c |  6 +-
 drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.h |  8 ++-
 drivers/gpu/drm/amd/powerplay/smumgr/smumgr.c  |  8 +++
 10 files changed, 74 insertions(+), 86 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/inc/smumgr.h 
b/drivers/gpu/drm/amd/powerplay/inc/smumgr.h
index 976e942..5d61cc9 100644
--- a/drivers/gpu/drm/amd/powerplay/inc/smumgr.h
+++ b/drivers/gpu/drm/amd/powerplay/inc/smumgr.h
@@ -131,6 +131,7 @@ struct pp_smumgr_func {
bool (*is_dpm_running)(struct pp_hwmgr *hwmgr);
int (*populate_requested_graphic_levels)(struct pp_hwmgr *hwmgr,
struct amd_pp_profile *request);
+   bool (*is_hw_avfs_present)(struct pp_smumgr *smumgr);
 };
 
 struct pp_smumgr {
@@ -202,6 +203,8 @@ extern uint32_t smum_get_offsetof(struct pp_smumgr *smumgr,
 extern int smum_populate_requested_graphic_levels(struct pp_hwmgr *hwmgr,
struct amd_pp_profile *request);
 
+extern bool smum_is_hw_avfs_present(struct pp_smumgr *smumgr);
+
 #define SMUM_FIELD_SHIFT(reg, field) reg##__##field##__SHIFT
 
 #define SMUM_FIELD_MASK(reg, field) reg##__##field##_MASK
diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smc.c 
b/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smc.c
index ca24e15..0750530 100644
--- a/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smc.c
+++ b/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smc.c
@@ -2134,16 +2134,16 @@ int fiji_thermal_avfs_enable(struct pp_hwmgr *hwmgr)
 {
int ret;
struct pp_smumgr *smumgr = (struct pp_smumgr *)(hwmgr->smumgr);
-   struct fiji_smumgr *smu_data = (struct fiji_smumgr *)(smumgr->backend);
+   struct smu7_smumgr *smu_data = (struct smu7_smumgr *)(smumgr->backend);
 
-   if (smu_data->avfs.AvfsBtcStatus != AVFS_BTC_ENABLEAVFS)
+   if (smu_data->avfs.avfs_btc_param != AVFS_BTC_ENABLEAVFS)
return 0;
 
ret = smum_send_msg_to_smc(smumgr, PPSMC_MSG_EnableAvfs);
 
if (!ret)
		/* If this param is not changed, this function could fire unnecessarily */
-   smu_data->avfs.AvfsBtcStatus = AVFS_BTC_COMPLETED_PREVIOUSLY;
+   smu_data->avfs.avfs_btc_param = AVFS_BTC_COMPLETED_PREVIOUSLY;
 
return ret;
 }
diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smumgr.c 
b/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smumgr.c
index 719e885..492f682 100644
--- a/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smumgr.c
+++ b/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smumgr.c
@@ -161,43 +161,46 @@ static int fiji_start_smu_in_non_protection_mode(struct pp_smumgr *smumgr)
 
 static int fiji_setup_pwr_virus(struct pp_smumgr *smumgr)
 {
-   int i, result = -1;
+   int i;
+   int result = -EINVAL;
uint32_t reg, data;
-   const PWR_Command_Table *virus = PwrVirusTable;
-   struct fiji_smumgr *priv = (struct fiji_smumgr *)(smumgr->backend);
 
-   priv->avfs.AvfsBtcStatus = AVFS_LOAD_VIRUS;
-   for (i = 0; (i < PWR_VIRUS_TABLE_SIZE); i++) {
-   switch (virus->command) {
+   const PWR_Command_Table *pvirus = PwrVirusTable;
+   struct smu7_smumgr *smu_data = (struct smu7_smumgr *)(smumgr->backend);
+
+   for (i = 0; i < PWR_VIRUS_TABLE_SIZE; i++) {
+   switch (pvirus->command) {
case PwrCmdWrite:
-   reg  = virus->reg;
-   data = virus->data;
+   reg  = pvirus->reg;
+   data = pvirus->data;
cgs_write_register(smumgr->device, reg, data);
break;
+
case PwrCmdEnd:
-   priv->avfs.AvfsBtcStatus = AVFS_BTC_VIRUS_LOADED;
result = 0;
break;
+
default:
-   pr_err("Table Exit with Invalid Command!");
-   priv->avfs.AvfsBtcStatus = AVFS_BTC_VIRUS_FAIL;
-   result = -1;
+   pr_info("Table Exit with Invalid Command!");
+   smu_data->avfs.avfs_btc_status = AVFS_BTC_VIRUS_FAIL;
+   result = -EINVAL;
break;
}
-   virus++;
+   pvirus++;
}
+
return result;
 }
 
 static int fiji_start_avfs_btc(struct pp_smumgr *smumgr)
 {
int result = 0;
-   struct 

[PATCH 2/4] drm/amd/powerplay: refine avfs enable code on fiji.

2017-07-03 Thread Rex Zhu
1. simplify avfs state switch.
2. delete save/restore VFT table functions as they are not
   supported by fiji.
3. implement thermal_avfs_enable function.

Change-Id: I5e28672591d77b1ec9e3406d0fc7d42566831a08
Signed-off-by: Rex Zhu 
---
 drivers/gpu/drm/amd/powerplay/smumgr/fiji_smc.c|  19 
 drivers/gpu/drm/amd/powerplay/smumgr/fiji_smc.h|   1 +
 drivers/gpu/drm/amd/powerplay/smumgr/fiji_smumgr.c | 115 ++---
 3 files changed, 28 insertions(+), 107 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smc.c 
b/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smc.c
index 6a320b2..ca24e15 100644
--- a/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smc.c
+++ b/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smc.c
@@ -2129,6 +2129,25 @@ int fiji_thermal_setup_fan_table(struct pp_hwmgr *hwmgr)
return 0;
 }
 
+
+int fiji_thermal_avfs_enable(struct pp_hwmgr *hwmgr)
+{
+   int ret;
+   struct pp_smumgr *smumgr = (struct pp_smumgr *)(hwmgr->smumgr);
+   struct fiji_smumgr *smu_data = (struct fiji_smumgr *)(smumgr->backend);
+
+   if (smu_data->avfs.AvfsBtcStatus != AVFS_BTC_ENABLEAVFS)
+   return 0;
+
+   ret = smum_send_msg_to_smc(smumgr, PPSMC_MSG_EnableAvfs);
+
+   if (!ret)
+		/* If this param is not changed, this function could fire unnecessarily */
+   smu_data->avfs.AvfsBtcStatus = AVFS_BTC_COMPLETED_PREVIOUSLY;
+
+   return ret;
+}
+
 static int fiji_program_mem_timing_parameters(struct pp_hwmgr *hwmgr)
 {
struct smu7_hwmgr *data = (struct smu7_hwmgr *)(hwmgr->backend);
diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smc.h 
b/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smc.h
index 0e9e1f2..d9c72d9 100644
--- a/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smc.h
+++ b/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smc.h
@@ -48,5 +48,6 @@ struct fiji_pt_defaults {
 bool fiji_is_dpm_running(struct pp_hwmgr *hwmgr);
 int fiji_populate_requested_graphic_levels(struct pp_hwmgr *hwmgr,
struct amd_pp_profile *request);
+int fiji_thermal_avfs_enable(struct pp_hwmgr *hwmgr);
 #endif
 
diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smumgr.c 
b/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smumgr.c
index a1cb785..719e885 100644
--- a/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smumgr.c
+++ b/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smumgr.c
@@ -194,22 +194,10 @@ static int fiji_start_avfs_btc(struct pp_smumgr *smumgr)
int result = 0;
struct fiji_smumgr *priv = (struct fiji_smumgr *)(smumgr->backend);
 
-   priv->avfs.AvfsBtcStatus = AVFS_BTC_STARTED;
if (priv->avfs.AvfsBtcParam) {
if (!smum_send_msg_to_smc_with_parameter(smumgr,
			PPSMC_MSG_PerformBtc, priv->avfs.AvfsBtcParam)) {
-			if (!smum_send_msg_to_smc(smumgr, PPSMC_MSG_EnableAvfs)) {
-				priv->avfs.AvfsBtcStatus = AVFS_BTC_COMPLETED_UNSAVED;
-   result = 0;
-   } else {
-   pr_err("[AVFS][fiji_start_avfs_btc] Attempt"
-   " to Enable AVFS Failed!");
-				smum_send_msg_to_smc(smumgr, PPSMC_MSG_DisableAvfs);
-   result = -1;
-   }
-   } else {
-   pr_err("[AVFS][fiji_start_avfs_btc] "
-   "PerformBTC SMU msg failed");
+   pr_err("PerformBTC SMU msg failed \n");
result = -1;
}
}
@@ -224,42 +212,6 @@ static int fiji_start_avfs_btc(struct pp_smumgr *smumgr)
return result;
 }
 
-static int fiji_setup_pm_fuse_for_avfs(struct pp_smumgr *smumgr)
-{
-   int result = 0;
-   uint32_t table_start;
-   uint32_t charz_freq_addr, inversion_voltage_addr, charz_freq;
-   uint16_t inversion_voltage;
-
-   charz_freq = 0x3075; /* In 10KHz units 0x7530 Actual value */
-   inversion_voltage = 0x1A04; /* mV Q14.2 0x41A Actual value */
-
-	PP_ASSERT_WITH_CODE(0 == smu7_read_smc_sram_dword(smumgr,
-			SMU7_FIRMWARE_HEADER_LOCATION + offsetof(SMU73_Firmware_Header,
-			PmFuseTable), &table_start, 0x4),
-			"[AVFS][Fiji_SetupGfxLvlStruct] SMU could not communicate "
-   "starting address of PmFuse structure",
-   return -1;);
-
-   charz_freq_addr = table_start +
-   offsetof(struct SMU73_Discrete_PmFuses, PsmCharzFreq);
-   inversion_voltage_addr = table_start +
-			offsetof(struct SMU73_Discrete_PmFuses, InversionVoltage);
-
-	result = smu7_copy_bytes_to_smc(smumgr, charz_freq_addr,
-			(uint8_t *)(&charz_freq), sizeof(charz_freq), 0x4);
-   PP_ASSERT_WITH_CODE(0 == result,
-   

[PATCH 4/4] drm/amd/powerplay: add avfs check for old asics on Vi.

2017-07-03 Thread Rex Zhu
Change-Id: I1737ba27ae2a8f5c579ccb541ced9cb979ffd1ff
Signed-off-by: Rex Zhu 
---
 drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c 
b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
index e7ecbd1..8fe62aa 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
@@ -4660,6 +4660,15 @@ static int smu7_set_power_profile_state(struct pp_hwmgr *hwmgr,
 
 static int smu7_avfs_control(struct pp_hwmgr *hwmgr, bool enable)
 {
+   struct pp_smumgr *smumgr = (struct pp_smumgr *)(hwmgr->smumgr);
+   struct smu7_smumgr *smu_data = (struct smu7_smumgr *)(smumgr->backend);
+
+   if (smu_data == NULL)
+   return -EINVAL;
+
+   if (smu_data->avfs.avfs_btc_status == AVFS_BTC_NOTSUPPORTED)
+   return 0;
+
if (enable) {
if (!PHM_READ_VFPF_INDIRECT_FIELD(hwmgr->device,
CGS_IND_REG__SMC, FEATURE_STATUS, AVS_ON))
-- 
1.9.1



[PATCH] drm/amdgpu: fix vulkan test performance drop and hang on VI

2017-07-03 Thread Rex Zhu
Caused by not programming dynamic_cu_mask_addr in the KIQ MQD.

v2: create struct vi_mqd_allocation in FB which will contain
1. PM4 MQD structure.
2. Write Pointer Poll Memory.
3. Read Pointer Report Memory
4. Dynamic CU Mask.
5. Dynamic RB Mask.

Change-Id: I22c840f1bf8d365f7df33a27d6b11e1aea8f2958
Signed-off-by: Rex Zhu 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c|  27 ++--
 drivers/gpu/drm/amd/include/vi_structs.h | 268 +++
 2 files changed, 285 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
index 1a75ab1..452cc5b 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
@@ -40,7 +40,6 @@
 
 #include "bif/bif_5_0_d.h"
 #include "bif/bif_5_0_sh_mask.h"
-
 #include "gca/gfx_8_0_d.h"
 #include "gca/gfx_8_0_enum.h"
 #include "gca/gfx_8_0_sh_mask.h"
@@ -2100,7 +2099,7 @@ static int gfx_v8_0_sw_init(void *handle)
return r;
 
/* create MQD for all compute queues as well as KIQ for SRIOV case */
-   r = amdgpu_gfx_compute_mqd_sw_init(adev, sizeof(struct vi_mqd));
+	r = amdgpu_gfx_compute_mqd_sw_init(adev, sizeof(struct vi_mqd_allocation));
if (r)
return r;
 
@@ -4715,9 +4714,6 @@ static int gfx_v8_0_mqd_init(struct amdgpu_ring *ring)
uint64_t hqd_gpu_addr, wb_gpu_addr, eop_base_addr;
uint32_t tmp;
 
-   /* init the mqd struct */
-   memset(mqd, 0, sizeof(struct vi_mqd));
-
mqd->header = 0xC0310800;
mqd->compute_pipelinestat_enable = 0x0001;
mqd->compute_static_thread_mgmt_se0 = 0x;
@@ -4725,7 +4721,12 @@ static int gfx_v8_0_mqd_init(struct amdgpu_ring *ring)
mqd->compute_static_thread_mgmt_se2 = 0x;
mqd->compute_static_thread_mgmt_se3 = 0x;
mqd->compute_misc_reserved = 0x0003;
-
+   if (!(adev->flags & AMD_IS_APU)) {
+	mqd->dynamic_cu_mask_addr_lo = lower_32_bits(ring->mqd_gpu_addr
+		+ offsetof(struct vi_mqd_allocation, dyamic_cu_mask));
+	mqd->dynamic_cu_mask_addr_hi = upper_32_bits(ring->mqd_gpu_addr
+		+ offsetof(struct vi_mqd_allocation, dyamic_cu_mask));
+   }
eop_base_addr = ring->eop_gpu_addr >> 8;
mqd->cp_hqd_eop_base_addr_lo = eop_base_addr;
mqd->cp_hqd_eop_base_addr_hi = upper_32_bits(eop_base_addr);
@@ -4900,7 +4901,7 @@ static int gfx_v8_0_kiq_init_queue(struct amdgpu_ring *ring)
if (adev->gfx.in_reset) { /* for GPU_RESET case */
/* reset MQD to a clean status */
if (adev->gfx.mec.mqd_backup[mqd_idx])
-			memcpy(mqd, adev->gfx.mec.mqd_backup[mqd_idx], sizeof(*mqd));
+			memcpy(mqd, adev->gfx.mec.mqd_backup[mqd_idx], sizeof(struct vi_mqd_allocation));
 
/* reset ring buffer */
ring->wptr = 0;
@@ -4916,6 +4917,9 @@ static int gfx_v8_0_kiq_init_queue(struct amdgpu_ring *ring)
		vi_srbm_select(adev, 0, 0, 0, 0);
		mutex_unlock(&adev->srbm_mutex);
	} else {
+		memset((void *)mqd, 0, sizeof(struct vi_mqd_allocation));
+		((struct vi_mqd_allocation *)mqd)->dyamic_cu_mask = 0xFFFFFFFF;
+		((struct vi_mqd_allocation *)mqd)->dyamic_rb_mask = 0xFFFFFFFF;
		mutex_lock(&adev->srbm_mutex);
		vi_srbm_select(adev, ring->me, ring->pipe, ring->queue, 0);
		gfx_v8_0_mqd_init(ring);
@@ -4929,7 +4933,7 @@ static int gfx_v8_0_kiq_init_queue(struct amdgpu_ring *ring)
		mutex_unlock(&adev->srbm_mutex);
 
if (adev->gfx.mec.mqd_backup[mqd_idx])
-			memcpy(adev->gfx.mec.mqd_backup[mqd_idx], mqd, sizeof(*mqd));
+			memcpy(adev->gfx.mec.mqd_backup[mqd_idx], mqd, sizeof(struct vi_mqd_allocation));
}
 
return r;
@@ -4947,6 +4951,9 @@ static int gfx_v8_0_kcq_init_queue(struct amdgpu_ring *ring)
	int mqd_idx = ring - &adev->gfx.compute_ring[0];
 
if (!adev->gfx.in_reset && !adev->gfx.in_suspend) {
+		memset((void *)mqd, 0, sizeof(struct vi_mqd_allocation));
+		((struct vi_mqd_allocation *)mqd)->dyamic_cu_mask = 0xFFFFFFFF;
+		((struct vi_mqd_allocation *)mqd)->dyamic_rb_mask = 0xFFFFFFFF;
		mutex_lock(&adev->srbm_mutex);
vi_srbm_select(adev, ring->me, ring->pipe, ring->queue, 0);
gfx_v8_0_mqd_init(ring);
@@ -4954,11 +4961,11 @@ static int gfx_v8_0_kcq_init_queue(struct amdgpu_ring *ring)
		mutex_unlock(&adev->srbm_mutex);
 
if (adev->gfx.mec.mqd_backup[mqd_idx])
-			memcpy(adev->gfx.mec.mqd_backup[mqd_idx], mqd, sizeof(*mqd));
+			memcpy(adev->gfx.mec.mqd_backup[mqd_idx], mqd, sizeof(struct vi_mqd_allocation));
} 

Re: Deprecation of AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED

2017-07-03 Thread Michel Dänzer
On 30/06/17 08:43 PM, Marek Olšák wrote:
> 
> I don't know what is being talked about here anymore, but I wouldn't
> like to use CPU_ACCESS_REQUIRED or CPU_ACCESS_REALLY_REQUIRED in
> userspace. The reason is that userspace doesn't and can't know whether
> CPU access will be required, and the frequency at which it will be
> required. 3 heaps {no CPU access, no flag, CPU access required} are
> too many. Userspace mostly doesn't use the "no flag" heap for VRAM. It
> uses "CPU access required" for almost everything except tiled
> textures, which use "no CPU access".

FWIW, the difference between setting CPU_ACCESS_REQUIRED and not setting
it for a BO created in VRAM will be: If it's set, the BO is initially
created in CPU visible VRAM, otherwise it's most likely created in CPU
invisible VRAM.

If userspace knows that a BO will likely be accessed by the CPU first,
setting the flag could save a move from CPU invisible to CPU visible
VRAM when the CPU access happens. Conversely, if a BO will likely never
be accessed by the CPU, not setting the flag may reduce pressure on CPU
visible VRAM.

Not sure radeonsi can make this distinction though.


> I've been trying to trim down the number of heaps. So far, I have:
> - VRAM_NO_CPU_ACCESS (implies WC)
> - VRAM (implies WC)
> - VRAM_GTT (combined, implies WC)

Is this useful? It means:

* The BO may be created in VRAM, or if there's no space, in GTT.
* Once the BO is in GTT for any reason, it will never go back to VRAM.

Such BOs will tend to end up in GTT after some time, at the latest after
suspend/resume.

I think it would be better for radeonsi to choose either VRAM or GTT as
the preferred domain, and let the kernel handle it.


> - GTT_WC
> - GTT




-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


Re: [PATCH RFC v2] drm/amdgpu: Set/clear CPU_ACCESS flag on page fault and move to VRAM

2017-07-03 Thread Michel Dänzer
On 01/07/17 12:31 AM, John Brooks wrote:
> When a BO is moved to VRAM, clear AMDGPU_BO_FLAG_CPU_ACCESS. This allows it
> to potentially later move to invisible VRAM if the CPU does not access it
> again.
> 
> Setting the CPU_ACCESS flag in amdgpu_bo_fault_reserve_notify() also means
> that we can remove the loop to restrict lpfn to the end of visible VRAM,
> because amdgpu_ttm_placement_init() will do it for us.
> 
> Signed-off-by: John Brooks 

[...]

> @@ -446,6 +448,12 @@ static int amdgpu_move_ram_vram(struct ttm_buffer_object 
> *bo,
>   if (unlikely(r)) {
>   goto out_cleanup;
>   }
> +
> + /* The page fault handler will re-set this if the CPU accesses the BO
> +  * after it's moved.
> +  */
> + abo->flags &= ~AMDGPU_BO_FLAG_CPU_ACCESS;

I've come to realize that the flag also needs to be cleared at the end
of amdgpu_bo_create_restricted, otherwise we will incorrectly assume
that every BO created with this flag has been accessed by the CPU.


BTW, there's also a minor issue here in that the flags member is u64,
but the flags are defined as ints. Probably doesn't matter so far, but
it would as soon as any flag's value is >= (1 << 32).




Re: [PATCH RFC 2/2] drm/amdgpu: Set/clear CPU_ACCESS flag on page fault and move to VRAM

2017-07-03 Thread Michel Dänzer
On 03/07/17 04:06 PM, Christian König wrote:
> Am 03.07.2017 um 03:34 schrieb Michel Dänzer:
> 
>> [...] I suggested clearing the flag here to John on IRC. The
>> idea is briefly described in the commit log, let me elaborate a bit on
>> that:
>>
>> When a BO is moved to VRAM which has the AMDGPU_BO_FLAG_CPU_ACCESS flag
>> set, it is put in CPU visible VRAM, and the flag is cleared. If the CPU
>> doesn't access the BO, the next time it will be moved to VRAM (after it
>> was evicted from there, for any reason), the flag will no longer be set,
>> and the BO will likely be moved to CPU invisible VRAM.
>>
>> If the BO is accessed by the CPU again though (no matter where the BO is
>> currently located at that time), the flag is set again, and the cycle
>> from the previous paragraph starts over.
>>
>> The end result should be similar as with the timestamp based solution in
>> John's earlier series: BOs which are at least occasionally accessed by
>> the CPU will tend to be in CPU visible VRAM, those which are never
>> accessed by the CPU can be in CPU invisible VRAM.
> Yeah, I understand the intention. But the implementation isn't even
> remotely correct.
> 
> First of all the flag must be cleared in the CS which wants to move the
> BO, not in the move functions when the decision where to put it is
> already made.

For the purpose of this patch, we should clear the flag when the BO is
actually moved to VRAM, regardless of how or why.


> Second currently the flag is set on page fault, but never cleared
> because the place where the code to clear it was added is just
> completely incorrect (see above).

My bad, thanks for pointing this out. The following at the end of
amdgpu_bo_move should do the trick:

if (new_mem->mem_type == TTM_PL_VRAM &&
old_mem->mem_type != TTM_PL_VRAM) {
/* amdgpu_bo_fault_reserve_notify will re-set this if
 * the CPU accesses the BO after it's moved.
 */
abo->flags &= ~AMDGPU_BO_FLAG_CPU_ACCESS;
}


> Instead of messing with all this I suggest that we just add a jiffies
> based timeout to the BO when we can clear the flag. For kernel BOs this
> timeout is just infinity.
> 
> Then we check in amdgpu_cs_bo_validate() before generating the
> placements if we could clear the flag and do so based on the timeout.

The idea for this patch was to save the memory and CPU cycles needed for
that approach.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


[PATCH 11/11] drm/amdgpu: add sysvm_size

2017-07-03 Thread Christian König
From: Christian König 

Limit the size of the SYSVM. This saves us a bunch of visible VRAM,
but also limits the maximum BO size we can swap out.

v2: rebased and cleaned up after GART to SYSVM rename.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c  | 6 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 4 
 drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c | 6 --
 drivers/gpu/drm/amd/amdgpu/amdgpu_sysvm.c   | 9 +
 5 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 1ed6b7a..81de31a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -77,6 +77,7 @@
 extern int amdgpu_modeset;
 extern int amdgpu_vram_limit;
 extern int amdgpu_gart_size;
+extern unsigned amdgpu_sysvm_size;
 extern int amdgpu_moverate;
 extern int amdgpu_benchmarking;
 extern int amdgpu_testing;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 228b262..daded9c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1086,6 +1086,12 @@ static void amdgpu_check_arguments(struct amdgpu_device 
*adev)
}
}
 
+   if (amdgpu_sysvm_size < 32) {
+   dev_warn(adev->dev, "sysvm size (%d) too small\n",
+amdgpu_sysvm_size);
+   amdgpu_sysvm_size = 32;
+   }
+
amdgpu_check_vm_size(adev);
 
amdgpu_check_block_size(adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 4bf4a80..56f9867 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -75,6 +75,7 @@
 
 int amdgpu_vram_limit = 0;
 int amdgpu_gart_size = -1; /* auto */
+unsigned amdgpu_sysvm_size = 256;
 int amdgpu_moverate = -1; /* auto */
 int amdgpu_benchmarking = 0;
 int amdgpu_testing = 0;
@@ -124,6 +125,9 @@ module_param_named(vramlimit, amdgpu_vram_limit, int, 0600);
 MODULE_PARM_DESC(gartsize, "Size of PCIE/IGP gart to setup in megabytes (32, 
64, etc., -1 = auto)");
 module_param_named(gartsize, amdgpu_gart_size, int, 0600);
 
+MODULE_PARM_DESC(sysvmsize, "Size of the system VM in megabytes (default 
256)");
+module_param_named(sysvmsize, amdgpu_sysvm_size, uint, 0600);
+
 MODULE_PARM_DESC(moverate, "Maximum buffer migration rate in MB/s. (32, 64, 
etc., -1=auto, 0=1=disabled)");
 module_param_named(moverate, amdgpu_moverate, int, 0600);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
index f46a97d..bbf6bd0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
@@ -42,6 +42,7 @@ struct amdgpu_gtt_mgr {
 static int amdgpu_gtt_mgr_init(struct ttm_mem_type_manager *man,
   unsigned long p_size)
 {
+   struct amdgpu_device *adev = amdgpu_ttm_adev(man->bdev);
struct amdgpu_gtt_mgr *mgr;
uint64_t start, size;
 
@@ -50,7 +51,7 @@ static int amdgpu_gtt_mgr_init(struct ttm_mem_type_manager 
*man,
return -ENOMEM;
 
start = AMDGPU_GTT_MAX_TRANSFER_SIZE * AMDGPU_GTT_NUM_TRANSFER_WINDOWS;
-   size = p_size - start;
+   size = (adev->mc.sysvm_size >> PAGE_SHIFT) - start;
	drm_mm_init(&mgr->mm, start, size);
	spin_lock_init(&mgr->lock);
mgr->available = p_size;
@@ -112,6 +113,7 @@ int amdgpu_gtt_mgr_alloc(struct ttm_mem_type_manager *man,
 const struct ttm_place *place,
 struct ttm_mem_reg *mem)
 {
+   struct amdgpu_device *adev = amdgpu_ttm_adev(man->bdev);
struct amdgpu_gtt_mgr *mgr = man->priv;
struct drm_mm_node *node = mem->mm_node;
enum drm_mm_insert_mode mode;
@@ -129,7 +131,7 @@ int amdgpu_gtt_mgr_alloc(struct ttm_mem_type_manager *man,
if (place && place->lpfn)
lpfn = place->lpfn;
else
-   lpfn = man->size;
+   lpfn = adev->sysvm.num_cpu_pages;
 
mode = DRM_MM_INSERT_BEST;
if (place && place->flags & TTM_PL_FLAG_TOPDOWN)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sysvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_sysvm.c
index ff436ad..711e4b6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sysvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sysvm.c
@@ -62,14 +62,7 @@
  */
 void amdgpu_sysvm_set_defaults(struct amdgpu_device *adev)
 {
-   /* unless the user had overridden it, set the gart
-* size equal to the 1024 or vram, whichever is larger.
-*/
-   if (amdgpu_gart_size == -1)
-   adev->mc.sysvm_size = max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
-   adev->mc.mc_vram_size);
-   else
-   

[PATCH 10/11] drm/amdgpu: setup GTT size directly from module parameter

2017-07-03 Thread Christian König
From: Christian König 

Set up the GTT size directly from the module parameter instead of relying on sysvm_size being the same as the parameter.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 9240357..72dd83e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1097,6 +1097,7 @@ static struct ttm_bo_driver amdgpu_bo_driver = {
 
 int amdgpu_ttm_init(struct amdgpu_device *adev)
 {
+   uint64_t gtt_size;
int r;
 
r = amdgpu_ttm_global_init(adev);
@@ -1143,14 +1144,19 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
}
DRM_INFO("amdgpu: %uM of VRAM memory ready\n",
 (unsigned) (adev->mc.real_vram_size / (1024 * 1024)));
-	r = ttm_bo_init_mm(&adev->mman.bdev, TTM_PL_TT,
-			   adev->mc.sysvm_size >> PAGE_SHIFT);
+
+   if (amdgpu_gart_size == -1)
+   gtt_size = max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
+  adev->mc.mc_vram_size);
+   else
+   gtt_size = (uint64_t)amdgpu_gart_size << 20;
+	r = ttm_bo_init_mm(&adev->mman.bdev, TTM_PL_TT, gtt_size >> PAGE_SHIFT);
if (r) {
DRM_ERROR("Failed initializing GTT heap.\n");
return r;
}
DRM_INFO("amdgpu: %uM of GTT memory ready.\n",
-(unsigned)(adev->mc.sysvm_size / (1024 * 1024)));
+(unsigned)(gtt_size / (1024 * 1024)));
 
adev->gds.mem.total_size = adev->gds.mem.total_size << AMDGPU_GDS_SHIFT;
adev->gds.mem.gfx_partition_size = adev->gds.mem.gfx_partition_size << 
AMDGPU_GDS_SHIFT;
-- 
2.7.4



[PATCH 08/11] drm/amdgpu: move SYSVM struct and function into amdgpu_sysvm.h

2017-07-03 Thread Christian König
From: Christian König 

No functional change.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h   | 48 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_sysvm.h | 77 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h   |  1 +
 3 files changed, 79 insertions(+), 47 deletions(-)
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_sysvm.h

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index abe191f..a2c0eac 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -69,6 +69,7 @@
 
 #include "gpu_scheduler.h"
 #include "amdgpu_virt.h"
+#include "amdgpu_sysvm.h"
 
 /*
  * Modules parameters.
@@ -534,53 +535,6 @@ int amdgpu_fence_slab_init(void);
 void amdgpu_fence_slab_fini(void);
 
 /*
- * GART structures, functions & helpers
- */
-struct amdgpu_mc;
-
-#define AMDGPU_GPU_PAGE_SIZE 4096
-#define AMDGPU_GPU_PAGE_MASK (AMDGPU_GPU_PAGE_SIZE - 1)
-#define AMDGPU_GPU_PAGE_SHIFT 12
-#define AMDGPU_GPU_PAGE_ALIGN(a) (((a) + AMDGPU_GPU_PAGE_MASK) & 
~AMDGPU_GPU_PAGE_MASK)
-
-struct amdgpu_sysvm {
-   dma_addr_t  table_addr;
-   struct amdgpu_bo*robj;
-   void*ptr;
-   unsignednum_gpu_pages;
-   unsignednum_cpu_pages;
-   unsignedtable_size;
-#ifdef CONFIG_DRM_AMDGPU_SYSVM_DEBUGFS
-   struct page **pages;
-#endif
-   boolready;
-
-   /* Asic default pte flags */
-   uint64_tsysvm_pte_flags;
-
-   const struct amdgpu_sysvm_funcs *sysvm_funcs;
-};
-
-void amdgpu_sysvm_set_defaults(struct amdgpu_device *adev);
-int amdgpu_sysvm_table_ram_alloc(struct amdgpu_device *adev);
-void amdgpu_sysvm_table_ram_free(struct amdgpu_device *adev);
-int amdgpu_sysvm_table_vram_alloc(struct amdgpu_device *adev);
-void amdgpu_sysvm_table_vram_free(struct amdgpu_device *adev);
-int amdgpu_sysvm_table_vram_pin(struct amdgpu_device *adev);
-void amdgpu_sysvm_table_vram_unpin(struct amdgpu_device *adev);
-int amdgpu_sysvm_init(struct amdgpu_device *adev);
-void amdgpu_sysvm_fini(struct amdgpu_device *adev);
-int amdgpu_sysvm_unbind(struct amdgpu_device *adev, uint64_t offset,
-   int pages);
-int amdgpu_sysvm_map(struct amdgpu_device *adev, uint64_t offset,
-   int pages, dma_addr_t *dma_addr, uint64_t flags,
-   void *dst);
-int amdgpu_sysvm_bind(struct amdgpu_device *adev, uint64_t offset,
-int pages, struct page **pagelist,
-dma_addr_t *dma_addr, uint64_t flags);
-int amdgpu_ttm_recover_gart(struct amdgpu_device *adev);
-
-/*
  * VMHUB structures, functions & helpers
  */
 struct amdgpu_vmhub {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sysvm.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_sysvm.h
new file mode 100644
index 000..7846765
--- /dev/null
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sysvm.h
@@ -0,0 +1,77 @@
+/*
+ * Copyright 2017 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ */
+
+#ifndef __AMDGPU_SYSVM_H__
+#define __AMDGPU_SYSVM_H__
+
+#include 
+
+/*
+ * SYSVM structures, functions & helpers
+ */
+struct amdgpu_device;
+struct amdgpu_bo;
+struct amdgpu_sysvm_funcs;
+
+#define AMDGPU_GPU_PAGE_SIZE 4096
+#define AMDGPU_GPU_PAGE_MASK (AMDGPU_GPU_PAGE_SIZE - 1)
+#define AMDGPU_GPU_PAGE_SHIFT 12
+#define AMDGPU_GPU_PAGE_ALIGN(a) (((a) + AMDGPU_GPU_PAGE_MASK) & 
~AMDGPU_GPU_PAGE_MASK)
+
+struct amdgpu_sysvm {
+   dma_addr_t  table_addr;
+   struct amdgpu_bo*robj;
+   void*ptr;
+   unsignednum_gpu_pages;
+   unsigned

[PATCH 03/11] drm/amdgpu: use the GTT windows for BO moves v2

2017-07-03 Thread Christian König
From: Christian König 

This way we don't need to map the full BO at a time any more.

v2: use fixed windows for src/dst

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 125 +++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h |   2 +
 2 files changed, 108 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 15148f1..1fc9866 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -47,10 +47,15 @@
 
 #define DRM_FILE_PAGE_OFFSET (0x100000000ULL >> PAGE_SHIFT)
 
+static int amdgpu_map_buffer(struct ttm_buffer_object *bo,
+struct ttm_mem_reg *mem, unsigned num_pages,
+uint64_t offset, unsigned window,
+struct amdgpu_ring *ring,
+uint64_t *addr);
+
 static int amdgpu_ttm_debugfs_init(struct amdgpu_device *adev);
 static void amdgpu_ttm_debugfs_fini(struct amdgpu_device *adev);
 
-
 /*
  * Global memory.
  */
@@ -97,6 +102,8 @@ static int amdgpu_ttm_global_init(struct amdgpu_device *adev)
goto error_bo;
}
 
+	mutex_init(&adev->mman.gtt_window_lock);
+
ring = adev->mman.buffer_funcs_ring;
	rq = &ring->sched.sched_rq[AMD_SCHED_PRIORITY_KERNEL];
	r = amd_sched_entity_init(&ring->sched, &adev->mman.entity,
@@ -123,6 +130,7 @@ static void amdgpu_ttm_global_fini(struct amdgpu_device 
*adev)
if (adev->mman.mem_global_referenced) {
amd_sched_entity_fini(adev->mman.entity.sched,
			      &adev->mman.entity);
+		mutex_destroy(&adev->mman.gtt_window_lock);
		drm_global_item_unref(&adev->mman.bo_global_ref.ref);
		drm_global_item_unref(&adev->mman.mem_global_ref);
adev->mman.mem_global_referenced = false;
@@ -256,10 +264,13 @@ static uint64_t amdgpu_mm_node_addr(struct 
ttm_buffer_object *bo,
struct drm_mm_node *mm_node,
struct ttm_mem_reg *mem)
 {
-   uint64_t addr;
+   uint64_t addr = 0;
 
-   addr = mm_node->start << PAGE_SHIFT;
-   addr += bo->bdev->man[mem->mem_type].gpu_offset;
+   if (mem->mem_type != TTM_PL_TT ||
+   amdgpu_gtt_mgr_is_allocated(mem)) {
+   addr = mm_node->start << PAGE_SHIFT;
+   addr += bo->bdev->man[mem->mem_type].gpu_offset;
+   }
return addr;
 }
 
@@ -284,34 +295,41 @@ static int amdgpu_move_blit(struct ttm_buffer_object *bo,
return -EINVAL;
}
 
-   if (old_mem->mem_type == TTM_PL_TT) {
-   r = amdgpu_ttm_bind(bo, old_mem);
-   if (r)
-   return r;
-   }
-
old_mm = old_mem->mm_node;
old_size = old_mm->size;
old_start = amdgpu_mm_node_addr(bo, old_mm, old_mem);
 
-   if (new_mem->mem_type == TTM_PL_TT) {
-   r = amdgpu_ttm_bind(bo, new_mem);
-   if (r)
-   return r;
-   }
-
new_mm = new_mem->mm_node;
new_size = new_mm->size;
new_start = amdgpu_mm_node_addr(bo, new_mm, new_mem);
 
num_pages = new_mem->num_pages;
+	mutex_lock(&adev->mman.gtt_window_lock);
while (num_pages) {
-   unsigned long cur_pages = min(old_size, new_size);
+   unsigned long cur_pages = min(min(old_size, new_size),
+ 
(u64)AMDGPU_GTT_MAX_TRANSFER_SIZE);
+   uint64_t from = old_start, to = new_start;
struct dma_fence *next;
 
-   r = amdgpu_copy_buffer(ring, old_start, new_start,
+   if (old_mem->mem_type == TTM_PL_TT &&
+   !amdgpu_gtt_mgr_is_allocated(old_mem)) {
+   r = amdgpu_map_buffer(bo, old_mem, cur_pages,
+			      old_start, 0, ring, &from);
+   if (r)
+   goto error;
+   }
+
+   if (new_mem->mem_type == TTM_PL_TT &&
+   !amdgpu_gtt_mgr_is_allocated(new_mem)) {
+   r = amdgpu_map_buffer(bo, new_mem, cur_pages,
+			      new_start, 1, ring, &to);
+   if (r)
+   goto error;
+   }
+
+   r = amdgpu_copy_buffer(ring, from, to,
   cur_pages * PAGE_SIZE,
-			       bo->resv, &next, false, false);
+			       bo->resv, &next, false, true);
if (r)
goto error;
 
@@ -338,12 +356,15 @@ static int amdgpu_move_blit(struct ttm_buffer_object *bo,
new_start += cur_pages * PAGE_SIZE;
}
}
+   

[PATCH 09/11] drm/amdgpu: move amdgpu_sysvm_location into amdgpu_sysvm.c as well

2017-07-03 Thread Christian König
From: Christian König 

No intended functional change.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h|  1 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 36 
 drivers/gpu/drm/amd/amdgpu/amdgpu_sysvm.c  | 38 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_sysvm.h  |  2 ++
 4 files changed, 40 insertions(+), 37 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index a2c0eac..1ed6b7a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1862,7 +1862,6 @@ bool amdgpu_ttm_tt_is_readonly(struct ttm_tt *ttm);
 uint64_t amdgpu_ttm_tt_pte_flags(struct amdgpu_device *adev, struct ttm_tt 
*ttm,
 struct ttm_mem_reg *mem);
 void amdgpu_vram_location(struct amdgpu_device *adev, struct amdgpu_mc *mc, 
u64 base);
-void amdgpu_sysvm_location(struct amdgpu_device *adev, struct amdgpu_mc *mc);
 void amdgpu_ttm_set_active_vram_size(struct amdgpu_device *adev, u64 size);
 int amdgpu_ttm_init(struct amdgpu_device *adev);
 void amdgpu_ttm_fini(struct amdgpu_device *adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 46a82d3..228b262 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -666,42 +666,6 @@ void amdgpu_vram_location(struct amdgpu_device *adev, 
struct amdgpu_mc *mc, u64
mc->vram_end, mc->real_vram_size >> 20);
 }
 
-/**
- * amdgpu_sysvm_location - try to find SYSVM location
- * @adev: amdgpu device structure holding all necessary informations
- * @mc: memory controller structure holding memory informations
- *
- * Function will place try to place SYSVM before or after VRAM.
- *
- * If SYSVM size is bigger than space left then we ajust SYSVM size.
- * Thus function will never fails.
- *
- * FIXME: when reducing SYSVM size align new size on power of 2.
- */
-void amdgpu_sysvm_location(struct amdgpu_device *adev, struct amdgpu_mc *mc)
-{
-   u64 size_af, size_bf;
-
-   size_af = ((adev->mc.mc_mask - mc->vram_end) + mc->sysvm_base_align) & 
~mc->sysvm_base_align;
-   size_bf = mc->vram_start & ~mc->sysvm_base_align;
-   if (size_bf > size_af) {
-   if (mc->sysvm_size > size_bf) {
-   dev_warn(adev->dev, "limiting SYSVM\n");
-   mc->sysvm_size = size_bf;
-   }
-   mc->sysvm_start = 0;
-   } else {
-   if (mc->sysvm_size > size_af) {
-   dev_warn(adev->dev, "limiting SYSVM\n");
-   mc->sysvm_size = size_af;
-   }
-   mc->sysvm_start = (mc->vram_end + 1 + mc->sysvm_base_align) & 
~mc->sysvm_base_align;
-   }
-   mc->sysvm_end = mc->sysvm_start + mc->sysvm_size - 1;
-   dev_info(adev->dev, "SYSVM: %lluM 0x%016llX - 0x%016llX\n",
-   mc->sysvm_size >> 20, mc->sysvm_start, mc->sysvm_end);
-}
-
 /*
  * GPU helpers function.
  */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sysvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_sysvm.c
index 50fc8d7..ff436ad 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sysvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sysvm.c
@@ -73,6 +73,44 @@ void amdgpu_sysvm_set_defaults(struct amdgpu_device *adev)
 }
 
 /**
+ * amdgpu_sysvm_location - try to find SYSVM location
+ * @adev: amdgpu device structure holding all necessary information
+ * @mc: memory controller structure holding memory information
+ *
+ * The function will try to place the SYSVM before or after VRAM.
+ *
+ * If the SYSVM size is bigger than the space left, the SYSVM size is adjusted.
+ * Thus the function never fails.
+ *
+ * FIXME: when reducing SYSVM size align new size on power of 2.
+ */
+void amdgpu_sysvm_location(struct amdgpu_device *adev, struct amdgpu_mc *mc)
+{
+   u64 size_af, size_bf;
+
+   size_af = ((adev->mc.mc_mask - mc->vram_end) + mc->sysvm_base_align) &
+   ~mc->sysvm_base_align;
+   size_bf = mc->vram_start & ~mc->sysvm_base_align;
+   if (size_bf > size_af) {
+   if (mc->sysvm_size > size_bf) {
+   dev_warn(adev->dev, "limiting SYSVM\n");
+   mc->sysvm_size = size_bf;
+   }
+   mc->sysvm_start = 0;
+   } else {
+   if (mc->sysvm_size > size_af) {
+   dev_warn(adev->dev, "limiting SYSVM\n");
+   mc->sysvm_size = size_af;
+   }
+   mc->sysvm_start = (mc->vram_end + 1 + mc->sysvm_base_align) &
+   ~mc->sysvm_base_align;
+   }
+   mc->sysvm_end = mc->sysvm_start + mc->sysvm_size - 1;
+   dev_info(adev->dev, "SYSVM: %lluM 0x%016llX - 0x%016llX\n",
+   mc->sysvm_size >> 20, mc->sysvm_start, mc->sysvm_end);

[PATCH 07/11] drm/amdgpu: rename GART to SYSVM

2017-07-03 Thread Christian König
From: Christian König 

Mass-rename all names related to the hardware GART/GTT functions to SYSVM.

The name of symbols related to the TTM TT domain stay the same.

This should improve the distinction between the two.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/Kconfig |   9 +-
 drivers/gpu/drm/amd/amdgpu/Makefile|   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu.h|  58 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  48 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c   | 423 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c|   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_sysvm.c  | 423 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_test.c   |  84 +++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c|  76 +++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h|   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c |  30 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  |   4 +-
 drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c   |  16 +-
 drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.h   |   4 +-
 drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c  |  66 ++---
 drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c  |  70 ++---
 drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c  |  70 ++---
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  |  66 ++---
 drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c|  16 +-
 drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.h|   4 +-
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c |   4 +-
 drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c  |   8 +-
 drivers/gpu/drm/amd/amdgpu/vce_v4_0.c  |   4 +-
 drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c  |   8 +-
 24 files changed, 749 insertions(+), 748 deletions(-)
 delete mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_sysvm.c

diff --git a/drivers/gpu/drm/amd/amdgpu/Kconfig 
b/drivers/gpu/drm/amd/amdgpu/Kconfig
index e8af1f5..ebbac01 100644
--- a/drivers/gpu/drm/amd/amdgpu/Kconfig
+++ b/drivers/gpu/drm/amd/amdgpu/Kconfig
@@ -31,14 +31,15 @@ config DRM_AMDGPU_USERPTR
  This option selects CONFIG_MMU_NOTIFIER if it isn't already
  selected to enabled full userptr support.
 
-config DRM_AMDGPU_GART_DEBUGFS
-   bool "Allow GART access through debugfs"
+config DRM_AMDGPU_SYSVM_DEBUGFS
+   bool "Allow SYSVM access through debugfs"
depends on DRM_AMDGPU
depends on DEBUG_FS
default n
help
- Selecting this option creates a debugfs file to inspect the mapped
- pages. Uses more memory for housekeeping, enable only for debugging.
+ Selecting this option creates a debugfs file to inspect the SYSVM
+ mapped pages. Uses more memory for housekeeping, enable only for
+ debugging.
 
 source "drivers/gpu/drm/amd/acp/Kconfig"
 source "drivers/gpu/drm/amd/display/Kconfig"
diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile 
b/drivers/gpu/drm/amd/amdgpu/Makefile
index 3661110..d80d49f 100644
--- a/drivers/gpu/drm/amd/amdgpu/Makefile
+++ b/drivers/gpu/drm/amd/amdgpu/Makefile
@@ -22,7 +22,7 @@ amdgpu-y := amdgpu_drv.o
 # add KMS driver
 amdgpu-y += amdgpu_device.o amdgpu_kms.o \
amdgpu_atombios.o atombios_crtc.o amdgpu_connectors.o \
-   atom.o amdgpu_fence.o amdgpu_ttm.o amdgpu_object.o amdgpu_gart.o \
+   atom.o amdgpu_fence.o amdgpu_ttm.o amdgpu_object.o amdgpu_sysvm.o \
amdgpu_encoders.o amdgpu_display.o amdgpu_i2c.o \
amdgpu_fb.o amdgpu_gem.o amdgpu_ring.o \
amdgpu_cs.o amdgpu_bios.o amdgpu_benchmark.o amdgpu_test.o \
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 4a2b33d..abe191f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -305,7 +305,7 @@ struct amdgpu_vm_pte_funcs {
 };
 
 /* provided by the gmc block */
-struct amdgpu_gart_funcs {
+struct amdgpu_sysvm_funcs {
/* flush the vm tlb via mmio */
void (*flush_gpu_tlb)(struct amdgpu_device *adev,
  uint32_t vmid);
@@ -543,39 +543,39 @@ struct amdgpu_mc;
 #define AMDGPU_GPU_PAGE_SHIFT 12
 #define AMDGPU_GPU_PAGE_ALIGN(a) (((a) + AMDGPU_GPU_PAGE_MASK) & 
~AMDGPU_GPU_PAGE_MASK)
 
-struct amdgpu_gart {
+struct amdgpu_sysvm {
dma_addr_t  table_addr;
struct amdgpu_bo*robj;
void*ptr;
unsignednum_gpu_pages;
unsignednum_cpu_pages;
unsignedtable_size;
-#ifdef CONFIG_DRM_AMDGPU_GART_DEBUGFS
+#ifdef CONFIG_DRM_AMDGPU_SYSVM_DEBUGFS
struct page **pages;
 #endif
boolready;
 
/* Asic default pte flags */
-   uint64_tgart_pte_flags;
+   uint64_tsysvm_pte_flags;
 
-   const struct amdgpu_gart_funcs *gart_funcs;
+   const struct 

[PATCH 04/11] drm/amdgpu: stop mapping BOs to GTT

2017-07-03 Thread Christian König
From: Christian König 

No need to map BOs to GTT on eviction and intermediate transfers any more.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 19 ++-
 1 file changed, 2 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 1fc9866..5c7a6c5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -199,7 +199,6 @@ static void amdgpu_evict_flags(struct ttm_buffer_object *bo,
.lpfn = 0,
.flags = TTM_PL_MASK_CACHING | TTM_PL_FLAG_SYSTEM
};
-   unsigned i;
 
if (!amdgpu_ttm_bo_is_amdgpu_bo(bo)) {
		placement->placement = &placements;
@@ -217,20 +216,6 @@ static void amdgpu_evict_flags(struct ttm_buffer_object 
*bo,
amdgpu_ttm_placement_from_domain(abo, 
AMDGPU_GEM_DOMAIN_CPU);
} else {
amdgpu_ttm_placement_from_domain(abo, 
AMDGPU_GEM_DOMAIN_GTT);
-   for (i = 0; i < abo->placement.num_placement; ++i) {
-   if (!(abo->placements[i].flags &
- TTM_PL_FLAG_TT))
-   continue;
-
-   if (abo->placements[i].lpfn)
-   continue;
-
-   /* set an upper limit to force directly
-* allocating address space for the BO.
-*/
-   abo->placements[i].lpfn =
-   adev->mc.gtt_size >> PAGE_SHIFT;
-   }
}
break;
case TTM_PL_TT:
@@ -391,7 +376,7 @@ static int amdgpu_move_vram_ram(struct ttm_buffer_object 
*bo,
placement.num_busy_placement = 1;
	placement.busy_placement = &placements;
placements.fpfn = 0;
-   placements.lpfn = adev->mc.gtt_size >> PAGE_SHIFT;
+   placements.lpfn = 0;
placements.flags = TTM_PL_MASK_CACHING | TTM_PL_FLAG_TT;
	r = ttm_bo_mem_space(bo, &placement, &tmp_mem,
 interruptible, no_wait_gpu);
@@ -438,7 +423,7 @@ static int amdgpu_move_ram_vram(struct ttm_buffer_object 
*bo,
placement.num_busy_placement = 1;
	placement.busy_placement = &placements;
placements.fpfn = 0;
-   placements.lpfn = adev->mc.gtt_size >> PAGE_SHIFT;
+   placements.lpfn = 0;
placements.flags = TTM_PL_MASK_CACHING | TTM_PL_FLAG_TT;
	r = ttm_bo_mem_space(bo, &placement, &tmp_mem,
 interruptible, no_wait_gpu);
-- 
2.7.4



[PATCH 01/11] drm/amdgpu: reserve the first 2x512 of GART

2017-07-03 Thread Christian König
From: Christian König 

We want to use them as remap address space.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c | 5 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h | 3 +++
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
index 1ef6255..f46a97d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
@@ -43,12 +43,15 @@ static int amdgpu_gtt_mgr_init(struct ttm_mem_type_manager 
*man,
   unsigned long p_size)
 {
struct amdgpu_gtt_mgr *mgr;
+   uint64_t start, size;
 
mgr = kzalloc(sizeof(*mgr), GFP_KERNEL);
if (!mgr)
return -ENOMEM;
 
-	drm_mm_init(&mgr->mm, 0, p_size);
+   start = AMDGPU_GTT_MAX_TRANSFER_SIZE * AMDGPU_GTT_NUM_TRANSFER_WINDOWS;
+   size = p_size - start;
+	drm_mm_init(&mgr->mm, start, size);
	spin_lock_init(&mgr->lock);
mgr->available = p_size;
man->priv = mgr;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
index 776a20a..c8059f0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
@@ -34,6 +34,9 @@
 #define AMDGPU_PL_FLAG_GWS (TTM_PL_FLAG_PRIV << 1)
 #define AMDGPU_PL_FLAG_OA  (TTM_PL_FLAG_PRIV << 2)
 
+#define AMDGPU_GTT_MAX_TRANSFER_SIZE   512
+#define AMDGPU_GTT_NUM_TRANSFER_WINDOWS2
+
 struct amdgpu_mman {
struct ttm_bo_global_refbo_global_ref;
struct drm_global_reference mem_global_ref;
-- 
2.7.4



[PATCH 02/11] drm/amdgpu: add amdgpu_gart_map function v2

2017-07-03 Thread Christian König
From: Christian König 

This allows us to write the mapped PTEs into
an IB instead of the table directly.

v2: fix build with debugfs enabled, remove unused assignment

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h  |  3 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c | 62 
 2 files changed, 51 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 810796a..4a2b33d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -572,6 +572,9 @@ int amdgpu_gart_init(struct amdgpu_device *adev);
 void amdgpu_gart_fini(struct amdgpu_device *adev);
 int amdgpu_gart_unbind(struct amdgpu_device *adev, uint64_t offset,
int pages);
+int amdgpu_gart_map(struct amdgpu_device *adev, uint64_t offset,
+   int pages, dma_addr_t *dma_addr, uint64_t flags,
+   void *dst);
 int amdgpu_gart_bind(struct amdgpu_device *adev, uint64_t offset,
 int pages, struct page **pagelist,
 dma_addr_t *dma_addr, uint64_t flags);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
index 8877015..c808388 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
@@ -280,6 +280,41 @@ int amdgpu_gart_unbind(struct amdgpu_device *adev, 
uint64_t offset,
 }
 
 /**
+ * amdgpu_gart_map - map dma_addresses into GART entries
+ *
+ * @adev: amdgpu_device pointer
+ * @offset: offset into the GPU's gart aperture
+ * @pages: number of pages to map
+ * @dma_addr: DMA addresses of the pages
+ * @flags: page table entry flags
+ * @dst: destination page table to write the entries into
+ *
+ * Map the DMA addresses into GART entries (all ASICs).
+ * Returns 0 for success, -EINVAL for failure.
+ */
+int amdgpu_gart_map(struct amdgpu_device *adev, uint64_t offset,
+   int pages, dma_addr_t *dma_addr, uint64_t flags,
+   void *dst)
+{
+   uint64_t page_base;
+   unsigned i, j, t;
+
+   if (!adev->gart.ready) {
+   WARN(1, "trying to bind memory to uninitialized GART !\n");
+   return -EINVAL;
+   }
+
+   t = offset / AMDGPU_GPU_PAGE_SIZE;
+
+   for (i = 0; i < pages; i++) {
+   page_base = dma_addr[i];
+   for (j = 0; j < (PAGE_SIZE / AMDGPU_GPU_PAGE_SIZE); j++, t++) {
+   amdgpu_gart_set_pte_pde(adev, dst, t, page_base, flags);
+   page_base += AMDGPU_GPU_PAGE_SIZE;
+   }
+   }
+   return 0;
+}
+
+/**
  * amdgpu_gart_bind - bind pages into the gart page table
  *
  * @adev: amdgpu_device pointer
@@ -296,31 +331,30 @@ int amdgpu_gart_bind(struct amdgpu_device *adev, uint64_t 
offset,
 int pages, struct page **pagelist, dma_addr_t *dma_addr,
 uint64_t flags)
 {
-   unsigned t;
-   unsigned p;
-   uint64_t page_base;
-   int i, j;
+#ifdef CONFIG_DRM_AMDGPU_GART_DEBUGFS
+	unsigned i, t, p;
+#endif
+   int r;
 
if (!adev->gart.ready) {
WARN(1, "trying to bind memory to uninitialized GART !\n");
return -EINVAL;
}
 
+#ifdef CONFIG_DRM_AMDGPU_GART_DEBUGFS
t = offset / AMDGPU_GPU_PAGE_SIZE;
p = t / (PAGE_SIZE / AMDGPU_GPU_PAGE_SIZE);
-
-   for (i = 0; i < pages; i++, p++) {
-#ifdef CONFIG_DRM_AMDGPU_GART_DEBUGFS
+   for (i = 0; i < pages; i++, p++)
adev->gart.pages[p] = pagelist[i];
 #endif
-   if (adev->gart.ptr) {
-   page_base = dma_addr[i];
-   for (j = 0; j < (PAGE_SIZE / AMDGPU_GPU_PAGE_SIZE); 
j++, t++) {
-   amdgpu_gart_set_pte_pde(adev, adev->gart.ptr, 
t, page_base, flags);
-   page_base += AMDGPU_GPU_PAGE_SIZE;
-   }
-   }
+
+   if (adev->gart.ptr) {
+   r = amdgpu_gart_map(adev, offset, pages, dma_addr, flags,
+   adev->gart.ptr);
+   if (r)
+   return r;
}
+
mb();
amdgpu_gart_flush_gpu_tlb(adev, 0);
return 0;
-- 
2.7.4



[PATCH 05/11] drm/amdgpu: remove maximum BO size limitation v2

2017-07-03 Thread Christian König
From: Christian König 

We can finally remove this now.

v2: remove now unused max_size variable as well.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 12 
 1 file changed, 12 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index 96c4493..917ac5e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -49,7 +49,6 @@ int amdgpu_gem_object_create(struct amdgpu_device *adev, unsigned long size,
struct drm_gem_object **obj)
 {
struct amdgpu_bo *robj;
-   unsigned long max_size;
int r;
 
*obj = NULL;
@@ -58,17 +57,6 @@ int amdgpu_gem_object_create(struct amdgpu_device *adev, unsigned long size,
alignment = PAGE_SIZE;
}
 
-   if (!(initial_domain & (AMDGPU_GEM_DOMAIN_GDS | AMDGPU_GEM_DOMAIN_GWS | AMDGPU_GEM_DOMAIN_OA))) {
-   /* Maximum bo size is the unpinned gtt size since we use the gtt to
-* handle vram to system pool migrations.
-*/
-   max_size = adev->mc.gtt_size - adev->gart_pin_size;
-   if (size > max_size) {
-   DRM_DEBUG("Allocation size %ldMb bigger than %ldMb limit\n",
- size >> 20, max_size >> 20);
-   return -ENOMEM;
-   }
-   }
 retry:
r = amdgpu_bo_create(adev, size, alignment, kernel, initial_domain,
flags, NULL, NULL, &robj);
-- 
2.7.4



[PATCH 06/11] drm/amdgpu: use TTM values instead of MC values for the info queries

2017-07-03 Thread Christian König
From: Christian König 

Use the TTM values instead of the hardware config here.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
index 00ef2fc..7a8da32 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
@@ -484,7 +484,8 @@ static int amdgpu_info_ioctl(struct drm_device *dev, void *data, struct drm_file
vram_gtt.vram_size -= adev->vram_pin_size;
vram_gtt.vram_cpu_accessible_size = adev->mc.visible_vram_size;
	vram_gtt.vram_cpu_accessible_size -= (adev->vram_pin_size - adev->invisible_pin_size);
-   vram_gtt.gtt_size  = adev->mc.gtt_size;
+   vram_gtt.gtt_size = adev->mman.bdev.man[TTM_PL_TT].size;
+   vram_gtt.gtt_size *= PAGE_SIZE;
vram_gtt.gtt_size -= adev->gart_pin_size;
	return copy_to_user(out, &vram_gtt,
			    min((size_t)size, sizeof(vram_gtt))) ? -EFAULT : 0;
@@ -509,9 +510,10 @@ static int amdgpu_info_ioctl(struct drm_device *dev, void *data, struct drm_file
mem.cpu_accessible_vram.max_allocation =
mem.cpu_accessible_vram.usable_heap_size * 3 / 4;
 
-   mem.gtt.total_heap_size = adev->mc.gtt_size;
-   mem.gtt.usable_heap_size =
-   adev->mc.gtt_size - adev->gart_pin_size;
+   mem.gtt.total_heap_size = adev->mman.bdev.man[TTM_PL_TT].size;
+   mem.gtt.total_heap_size *= PAGE_SIZE;
+   mem.gtt.usable_heap_size = mem.gtt.total_heap_size
+   - adev->gart_pin_size;
	mem.gtt.heap_usage = atomic64_read(&adev->gtt_usage);
mem.gtt.max_allocation = mem.gtt.usable_heap_size * 3 / 4;
 
-- 
2.7.4



Re: [PATCH] drm: amd: amdgpu: constify ttm_place structures.

2017-07-03 Thread Christian König

On 02.07.2017 at 11:13, Arvind Yadav wrote:

ttm_place are not supposed to change at runtime. All functions
working with ttm_place provided by  work
with const ttm_place. So mark the non-const structs as const.

Signed-off-by: Arvind Yadav 


Reviewed-by: Christian König 


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 5db0230..a2c8380 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -186,7 +186,7 @@ static void amdgpu_evict_flags(struct ttm_buffer_object *bo,
  {
struct amdgpu_device *adev = amdgpu_ttm_adev(bo->bdev);
struct amdgpu_bo *abo;
-   static struct ttm_place placements = {
+   static const struct ttm_place placements = {
.fpfn = 0,
.lpfn = 0,
.flags = TTM_PL_MASK_CACHING | TTM_PL_FLAG_SYSTEM





Re: [PATCH] drm: radeon: radeon_ttm: constify ttm_place structures.

2017-07-03 Thread Christian König

On 02.07.2017 at 11:06, Arvind Yadav wrote:

ttm_place are not supposed to change at runtime. All functions
working with ttm_place provided by  work
with const ttm_place. So mark the non-const structs as const.

File size before:
   text    data     bss     dec     hex filename
   9235     344     136    9715    25f3 drivers/gpu/drm/radeon/radeon_ttm.o

File size After adding 'const':
   text    data     bss     dec     hex filename
   9267     312     136    9715    25f3 drivers/gpu/drm/radeon/radeon_ttm.o

Signed-off-by: Arvind Yadav 


Reviewed-by: Christian König 


---
  drivers/gpu/drm/radeon/radeon_ttm.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c
index 8b7623b..6499832 100644
--- a/drivers/gpu/drm/radeon/radeon_ttm.c
+++ b/drivers/gpu/drm/radeon/radeon_ttm.c
@@ -178,7 +178,7 @@ static int radeon_init_mem_type(struct ttm_bo_device *bdev, uint32_t type,
  static void radeon_evict_flags(struct ttm_buffer_object *bo,
struct ttm_placement *placement)
  {
-   static struct ttm_place placements = {
+   static const struct ttm_place placements = {
.fpfn = 0,
.lpfn = 0,
.flags = TTM_PL_MASK_CACHING | TTM_PL_FLAG_SYSTEM





Re: [PATCH] drm: radeon: constify drm_prop_enum_list structures.

2017-07-03 Thread Christian König

On 01.07.2017 at 11:47, Arvind Yadav wrote:

drm_prop_enum_lists are not supposed to change at runtime. All functions
working with drm_prop_enum_list provided by  work with
const drm_prop_enum_list. So mark the non-const structs as const.

File size before:
   text    data     bss     dec     hex filename
  18276     384       0   18660    48e4 drivers/gpu/drm/radeon/radeon_display.o

File size After adding 'const':
   text    data     bss     dec     hex filename
  18660       0       0   18660    48e4 drivers/gpu/drm/radeon/radeon_display.o

Signed-off-by: Arvind Yadav 


Reviewed-by: Christian König 


---
  drivers/gpu/drm/radeon/radeon_display.c | 12 ++--
  1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_display.c b/drivers/gpu/drm/radeon/radeon_display.c
index 17d3daf..f339c1c 100644
--- a/drivers/gpu/drm/radeon/radeon_display.c
+++ b/drivers/gpu/drm/radeon/radeon_display.c
@@ -1388,12 +1388,12 @@ static const struct drm_mode_config_funcs radeon_mode_funcs = {
.output_poll_changed = radeon_output_poll_changed
  };
  
-static struct drm_prop_enum_list radeon_tmds_pll_enum_list[] =
+static const struct drm_prop_enum_list radeon_tmds_pll_enum_list[] =
  { { 0, "driver" },
{ 1, "bios" },
  };
  
-static struct drm_prop_enum_list radeon_tv_std_enum_list[] =
+static const struct drm_prop_enum_list radeon_tv_std_enum_list[] =
  { { TV_STD_NTSC, "ntsc" },
{ TV_STD_PAL, "pal" },
{ TV_STD_PAL_M, "pal-m" },
@@ -1404,25 +1404,25 @@ static struct drm_prop_enum_list radeon_tv_std_enum_list[] =
{ TV_STD_SECAM, "secam" },
  };
  
-static struct drm_prop_enum_list radeon_underscan_enum_list[] =
+static const struct drm_prop_enum_list radeon_underscan_enum_list[] =
  { { UNDERSCAN_OFF, "off" },
{ UNDERSCAN_ON, "on" },
{ UNDERSCAN_AUTO, "auto" },
  };
  
-static struct drm_prop_enum_list radeon_audio_enum_list[] =
+static const struct drm_prop_enum_list radeon_audio_enum_list[] =
  { { RADEON_AUDIO_DISABLE, "off" },
{ RADEON_AUDIO_ENABLE, "on" },
{ RADEON_AUDIO_AUTO, "auto" },
  };
  
  /* XXX support different dither options? spatial, temporal, both, etc. */

-static struct drm_prop_enum_list radeon_dither_enum_list[] =
+static const struct drm_prop_enum_list radeon_dither_enum_list[] =
  { { RADEON_FMT_DITHER_DISABLE, "off" },
{ RADEON_FMT_DITHER_ENABLE, "on" },
  };
  
-static struct drm_prop_enum_list radeon_output_csc_enum_list[] =
+static const struct drm_prop_enum_list radeon_output_csc_enum_list[] =
  { { RADEON_OUTPUT_CSC_BYPASS, "bypass" },
{ RADEON_OUTPUT_CSC_TVRGB, "tvrgb" },
{ RADEON_OUTPUT_CSC_YCBCR601, "ycbcr601" },





Re: [PATCH RFC 2/2] drm/amdgpu: Set/clear CPU_ACCESS flag on page fault and move to VRAM

2017-07-03 Thread Christian König

On 03.07.2017 at 03:34, Michel Dänzer wrote:

On 02/07/17 09:52 PM, Christian König wrote:

On 30.06.2017 at 17:18, John Brooks wrote:

When a BO is moved to VRAM, clear AMDGPU_BO_FLAG_CPU_ACCESS. This
allows it
to potentially later move to invisible VRAM if the CPU does not access it
again.

Setting the CPU_ACCESS flag in amdgpu_fault_reserve_notify() also means
that we can remove the loop to restrict lpfn to the end of visible VRAM,
because amdgpu_ttm_placement_init() will do it for us.

Signed-off-by: John Brooks 

[...]


@@ -446,6 +448,12 @@ static int amdgpu_move_ram_vram(struct ttm_buffer_object *bo,
   if (unlikely(r)) {
   goto out_cleanup;
   }
+
+/* The page fault handler will re-set this if the CPU accesses the BO
+ * after it's moved.
+ */

Maybe say "amdgpu_bo_fault_reserve_notify" explicitly here instead of
"The page fault handler".



+abo->flags &= ~AMDGPU_BO_FLAG_CPU_ACCESS;
+

This is the wrong place for clearing the flag. This code path is only
called when we move things back in after suspend/resume (or run out of
GTT space).

Surely amdgpu_move_ram_vram is called whenever a BO is moved to VRAM,


No, that isn't even remotely correct. amdgpu_move_ram_vram() is only 
called when the BO is moved directly from the system domain to the VRAM 
domain.


Normally BOs are only moved from the GTT domain to the VRAM domain, 
except after resume and when we ran out of GTT space.



for any reason. I suggested clearing the flag here to John on IRC. The
idea is briefly described in the commit log, let me elaborate a bit on that:

When a BO is moved to VRAM which has the AMDGPU_BO_FLAG_CPU_ACCESS flag
set, it is put in CPU visible VRAM, and the flag is cleared. If the CPU
doesn't access the BO, the next time it will be moved to VRAM (after it
was evicted from there, for any reason), the flag will no longer be set,
and the BO will likely be moved to CPU invisible VRAM.

If the BO is accessed by the CPU again though (no matter where the BO is
currently located at that time), the flag is set again, and the cycle
from the previous paragraph starts over.

The end result should be similar as with the timestamp based solution in
John's earlier series: BOs which are at least occasionally accessed by
the CPU will tend to be in CPU visible VRAM, those which are never
accessed by the CPU can be in CPU invisible VRAM.
Yeah, I understand the intention. But the implementation isn't even 
remotely correct.


First of all the flag must be cleared in the CS which wants to move the 
BO, not in the move functions when the decision where to put it is 
already made.


Second currently the flag is set on page fault, but never cleared 
because the place where the code to clear it was added is just 
completely incorrect (see above).


Instead of messing with all this I suggest that we just add a jiffies 
based timeout to the BO when we can clear the flag. For kernel BOs this 
timeout is just infinity.


Then we check in amdgpu_cs_bo_validate() before generating the 
placements if we could clear the flag and do so based on the timeout.


I can help implementing this when I'm done getting rid of the BO move 
size limitation (swapped all of this stuff for that task back into my 
brain anyway).


Regards,
Christian.