Re: [PATCH 3/9] drm/amdgpu: Don't force BOs into visible VRAM for page faults

2017-06-26 Thread John Brooks
On Mon, Jun 26, 2017 at 06:38:29PM +0900, Michel Dänzer wrote:
> On 24/06/17 02:39 AM, John Brooks wrote:
> > There is no need for page faults to force BOs into visible VRAM if it's
> > full, and the time it takes to do so is great enough to cause noticeable
> > stuttering. Add GTT as a possible placement so that if visible VRAM is
> > full, page faults move BOs to GTT instead of evicting other BOs from VRAM.
> > 
> > Signed-off-by: John Brooks 
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 14 ++
> >  1 file changed, 6 insertions(+), 8 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > index 244a7d3..751bc05 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > @@ -981,10 +981,11 @@ int amdgpu_bo_fault_reserve_notify(struct 
> > ttm_buffer_object *bo)
> >  
> > /* hurrah the memory is not visible ! */
> > atomic64_inc(&adev->num_vram_cpu_page_faults);
> > -   amdgpu_ttm_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_VRAM);
> > +   amdgpu_ttm_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_VRAM |
> > +AMDGPU_GEM_DOMAIN_GTT);
> > lpfn =  adev->mc.visible_vram_size >> PAGE_SHIFT;
> > for (i = 0; i < abo->placement.num_placement; i++) {
> > -   /* Force into visible VRAM */
> > +   /* Try to move the BO into visible VRAM */
> > if ((abo->placements[i].flags & TTM_PL_FLAG_VRAM) &&
> > (!abo->placements[i].lpfn ||
> >  abo->placements[i].lpfn > lpfn))
> > @@ -993,16 +994,13 @@ int amdgpu_bo_fault_reserve_notify(struct 
> > ttm_buffer_object *bo)
> > abo->placement.busy_placement = abo->placement.placement;
> > abo->placement.num_busy_placement = abo->placement.num_placement;
> > r = ttm_bo_validate(bo, &abo->placement, false, false);
> > -   if (unlikely(r == -ENOMEM)) {
> > -   amdgpu_ttm_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_GTT);
> > -   return ttm_bo_validate(bo, &abo->placement, false, false);
> > -   } else if (unlikely(r != 0)) {
> > +   if (unlikely(r != 0))
> > return r;
> > -   }
> >  
> > offset = bo->mem.start << PAGE_SHIFT;
> > /* this should never happen */
> > -   if ((offset + size) > adev->mc.visible_vram_size)
> > +   if (bo->mem.mem_type == TTM_PL_VRAM &&
> > +   (offset + size) > adev->mc.visible_vram_size)
> > return -EINVAL;
> >  
> > return 0;
> > 
> 
> AFAICT this is essentially the same as
> https://patchwork.freedesktop.org/patch/156993/ . I retracted that patch
> due to it causing Rally Dirt performance degradation for Marek.
> Presumably other patches in this series compensate for that, but at the
> least this patch should be moved to the end of the series.

Yeah, it did end up pretty much the same as your patch :p

The reason this causes the performance degradation is that if a page fault
moves a BO to GTT, and it's *not* marked CPU_ACCESS_REQUIRED, then CS will move
it to invisible VRAM. And then another page fault happens. And this repeats
fast enough that the BO is constantly moving back and forth between GTT and
VRAM.

Patch 6 (Set/clear CPU_ACCESS_REQUIRED flag on page fault and CS) takes care of
it by not allowing BOs to go to invisible VRAM until after a timeout and only
if it didn't have a page fault too soon after the last time it was moved to
VRAM. This was implemented by setting the CPU_ACCESS_REQUIRED flag on page
faults and not clearing it unless those conditions are met. It doesn't
necessarily need to involve the flag, depending on where we decide to go with
that.
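
The damping logic amounts to roughly the following sketch. The field
last_page_fault_jiffies and the timeout constant are illustrative names
for this explanation, not the actual patch:

    /* Sketch of the anti-ping-pong heuristic described above. */
    static bool amdgpu_bo_may_use_invisible_vram(struct amdgpu_bo *abo)
    {
            /* A CPU page fault sets the flag and records a timestamp. */
            if (!(abo->flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED))
                    return true;

            /* Only clear the flag, allowing a move to invisible VRAM,
             * once the BO has gone long enough without another fault.
             */
            if (time_after(jiffies, abo->last_page_fault_jiffies +
                                    CPU_ACCESS_TIMEOUT_JIFFIES)) {
                    abo->flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
                    return true;
            }

            return false;
    }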

John

> 
> 
> -- 
> Earthling Michel Dänzer   |   http://www.amd.com
> Libre software enthusiast | Mesa and X developer


Re: [PATCH] amdgpu: sync amdgpu_drm with kernel.

2017-06-26 Thread Michel Dänzer
On 27/06/17 11:58 AM, Dave Airlie wrote:
> From: Dave Airlie 
> 
> This syncs the amdgpu_drm header with my drm-next branch as of
> 6d61e70ccc21606ffb8a0a03bd3aba24f659502b.
> 
> It brings over the VM and semaphore API changes.
> 
> Generated using make headers_install.
> Generated from git://people.freedesktop.org/~airlied/linux drm-next commit 
> 6d61e70ccc2.
> 
> Signed-off-by: Dave Airlie 

Reviewed-by: Michel Dänzer 


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


Re: [PATCH libdrm] libdrm/amdgpu: add interface for kernel semaphores

2017-06-26 Thread Dave Airlie
Ignore this, all kinds of wrong-tree patch stuff going on here.

Dave.

On 27 June 2017 at 07:19, Dave Airlie  wrote:
> From: Dave Airlie 
>
> This adds the corresponding code for libdrm to use the new
> kernel interfaces for semaphores.
>
> This will be used by radv to implement shared semaphores.
>
> TODO: Version checks.
>
> Signed-off-by: Dave Airlie 
> ---
>  amdgpu/amdgpu.h  |  28 +
>  amdgpu/amdgpu_cs.c   | 161 
> ---
>  include/drm/amdgpu_drm.h |  28 +
>  3 files changed, 208 insertions(+), 9 deletions(-)
>
> diff --git a/amdgpu/amdgpu.h b/amdgpu/amdgpu.h
> index 7b26a04..747e248 100644
> --- a/amdgpu/amdgpu.h
> +++ b/amdgpu/amdgpu.h
> @@ -129,6 +129,8 @@ typedef struct amdgpu_va *amdgpu_va_handle;
>   */
>  typedef struct amdgpu_semaphore *amdgpu_semaphore_handle;
>
> +typedef uint32_t amdgpu_sem_handle;
> +
>  
> /*--*/
>  /* -- Structures -- 
> */
>  
> /*--*/
> @@ -365,6 +367,16 @@ struct amdgpu_cs_request {
> struct amdgpu_cs_fence_info fence_info;
>  };
>
> +struct amdgpu_cs_request_sem {
> +   /*
> +*
> +*/
> +   uint32_t number_of_wait_sem;
> +   uint32_t *wait_sems;
> +   uint32_t number_of_signal_sem;
> +   uint32_t *signal_sems;
> +};
> +
>  /**
>   * Structure which provide information about GPU VM MC Address space
>   * alignments requirements
> @@ -882,6 +894,12 @@ int amdgpu_cs_submit(amdgpu_context_handle context,
>  struct amdgpu_cs_request *ibs_request,
>  uint32_t number_of_requests);
>
> +int amdgpu_cs_submit_sem(amdgpu_context_handle context,
> +uint64_t flags,
> +struct amdgpu_cs_request *ibs_request,
> +struct amdgpu_cs_request_sem *ibs_sem,
> +uint32_t number_of_requests);
> +
>  /**
>   *  Query status of Command Buffer Submission
>   *
> @@ -1255,4 +1273,14 @@ int 
> amdgpu_cs_destroy_semaphore(amdgpu_semaphore_handle sem);
>  */
>  const char *amdgpu_get_marketing_name(amdgpu_device_handle dev);
>
> +int amdgpu_cs_create_sem(amdgpu_device_handle dev,
> +amdgpu_sem_handle *sem);
> +int amdgpu_cs_export_sem(amdgpu_device_handle dev,
> + amdgpu_sem_handle sem,
> +int *shared_handle);
> +int amdgpu_cs_import_sem(amdgpu_device_handle dev,
> + int shared_handle,
> +amdgpu_sem_handle *sem);
> +int amdgpu_cs_destroy_sem(amdgpu_device_handle dev,
> + amdgpu_sem_handle sem);
>  #endif /* #ifdef _AMDGPU_H_ */
> diff --git a/amdgpu/amdgpu_cs.c b/amdgpu/amdgpu_cs.c
> index fb5b3a8..7283327 100644
> --- a/amdgpu/amdgpu_cs.c
> +++ b/amdgpu/amdgpu_cs.c
> @@ -170,7 +170,8 @@ int amdgpu_cs_query_reset_state(amdgpu_context_handle 
> context,
>   * \sa amdgpu_cs_submit()
>  */
>  static int amdgpu_cs_submit_one(amdgpu_context_handle context,
> -   struct amdgpu_cs_request *ibs_request)
> +   struct amdgpu_cs_request *ibs_request,
> +   struct amdgpu_cs_request_sem *sem_request)
>  {
> union drm_amdgpu_cs cs;
> uint64_t *chunk_array;
> @@ -178,9 +179,11 @@ static int amdgpu_cs_submit_one(amdgpu_context_handle 
> context,
> struct drm_amdgpu_cs_chunk_data *chunk_data;
> struct drm_amdgpu_cs_chunk_dep *dependencies = NULL;
> struct drm_amdgpu_cs_chunk_dep *sem_dependencies = NULL;
> +   struct drm_amdgpu_cs_chunk_sem *wait_sem_dependencies = NULL;
> +   struct drm_amdgpu_cs_chunk_sem *signal_sem_dependencies = NULL;
> struct list_head *sem_list;
> amdgpu_semaphore_handle sem, tmp;
> -   uint32_t i, size, sem_count = 0;
> +   uint32_t i, j, size, sem_count = 0;
> bool user_fence;
> int r = 0;
>
> @@ -196,7 +199,7 @@ static int amdgpu_cs_submit_one(amdgpu_context_handle 
> context,
> }
> user_fence = (ibs_request->fence_info.handle != NULL);
>
> -   size = ibs_request->number_of_ibs + (user_fence ? 2 : 1) + 1;
> +   size = ibs_request->number_of_ibs + (user_fence ? 2 : 1) + 1 + 
> (sem_request ? 2 : 0);
>
> chunk_array = alloca(sizeof(uint64_t) * size);
> chunks = alloca(sizeof(struct drm_amdgpu_cs_chunk) * size);
> @@ -308,6 +311,45 @@ static int amdgpu_cs_submit_one(amdgpu_context_handle 
> context,
> chunks[i].chunk_data = (uint64_t)(uintptr_t)sem_dependencies;
> }
>
> +   if (sem_request) {
> +   if (sem_request->number_of_wait_sem) {
> +   wait_sem_dependencies = 

[PATCH] amdgpu: sync amdgpu_drm with kernel.

2017-06-26 Thread Dave Airlie
From: Dave Airlie 

This syncs the amdgpu_drm header with my drm-next branch as of
6d61e70ccc21606ffb8a0a03bd3aba24f659502b.

It brings over the VM and semaphore API changes.

Generated using make headers_install.
Generated from git://people.freedesktop.org/~airlied/linux drm-next commit 
6d61e70ccc2.

Signed-off-by: Dave Airlie 
---
 include/drm/amdgpu_drm.h | 54 +++-
 1 file changed, 53 insertions(+), 1 deletion(-)

diff --git a/include/drm/amdgpu_drm.h b/include/drm/amdgpu_drm.h
index 8cfe68c..d9aa4a3 100644
--- a/include/drm/amdgpu_drm.h
+++ b/include/drm/amdgpu_drm.h
@@ -51,6 +51,7 @@ extern "C" {
 #define DRM_AMDGPU_GEM_OP  0x10
 #define DRM_AMDGPU_GEM_USERPTR 0x11
 #define DRM_AMDGPU_WAIT_FENCES 0x12
+#define DRM_AMDGPU_VM  0x13
 
 #define DRM_IOCTL_AMDGPU_GEM_CREATEDRM_IOWR(DRM_COMMAND_BASE + 
DRM_AMDGPU_GEM_CREATE, union drm_amdgpu_gem_create)
 #define DRM_IOCTL_AMDGPU_GEM_MMAP  DRM_IOWR(DRM_COMMAND_BASE + 
DRM_AMDGPU_GEM_MMAP, union drm_amdgpu_gem_mmap)
@@ -65,6 +66,7 @@ extern "C" {
 #define DRM_IOCTL_AMDGPU_GEM_OPDRM_IOWR(DRM_COMMAND_BASE + 
DRM_AMDGPU_GEM_OP, struct drm_amdgpu_gem_op)
 #define DRM_IOCTL_AMDGPU_GEM_USERPTR   DRM_IOWR(DRM_COMMAND_BASE + 
DRM_AMDGPU_GEM_USERPTR, struct drm_amdgpu_gem_userptr)
 #define DRM_IOCTL_AMDGPU_WAIT_FENCES   DRM_IOWR(DRM_COMMAND_BASE + 
DRM_AMDGPU_WAIT_FENCES, union drm_amdgpu_wait_fences)
+#define DRM_IOCTL_AMDGPU_VMDRM_IOWR(DRM_COMMAND_BASE + 
DRM_AMDGPU_VM, union drm_amdgpu_vm)
 
 #define AMDGPU_GEM_DOMAIN_CPU  0x1
 #define AMDGPU_GEM_DOMAIN_GTT  0x2
@@ -190,6 +192,26 @@ union drm_amdgpu_ctx {
union drm_amdgpu_ctx_out out;
 };
 
+/* vm ioctl */
+#define AMDGPU_VM_OP_RESERVE_VMID  1
+#define AMDGPU_VM_OP_UNRESERVE_VMID2
+
+struct drm_amdgpu_vm_in {
+   /** AMDGPU_VM_OP_* */
+   __u32   op;
+   __u32   flags;
+};
+
+struct drm_amdgpu_vm_out {
+   /** For future use, no flags defined so far */
+   __u64   flags;
+};
+
+union drm_amdgpu_vm {
+   struct drm_amdgpu_vm_in in;
+   struct drm_amdgpu_vm_out out;
+};
+
 /*
  * This is not a reliable API and you should expect it to fail for any
  * number of reasons and have fallback path that do not use userptr to
@@ -295,7 +317,10 @@ union drm_amdgpu_gem_wait_idle {
 };
 
 struct drm_amdgpu_wait_cs_in {
-   /** Command submission handle */
+   /* Command submission handle
+ * handle equals 0 means none to wait for
+ * handle equals ~0ull means wait for the latest sequence number
+ */
__u64 handle;
/** Absolute timeout to wait */
__u64 timeout;
@@ -415,6 +440,8 @@ struct drm_amdgpu_gem_va {
 #define AMDGPU_CHUNK_ID_IB 0x01
 #define AMDGPU_CHUNK_ID_FENCE  0x02
 #define AMDGPU_CHUNK_ID_DEPENDENCIES   0x03
+#define AMDGPU_CHUNK_ID_SYNCOBJ_IN  0x04
+#define AMDGPU_CHUNK_ID_SYNCOBJ_OUT 0x05
 
 struct drm_amdgpu_cs_chunk {
__u32   chunk_id;
@@ -482,6 +509,10 @@ struct drm_amdgpu_cs_chunk_fence {
__u32 offset;
 };
 
+struct drm_amdgpu_cs_chunk_sem {
+   __u32 handle;
+};
+
 struct drm_amdgpu_cs_chunk_data {
union {
struct drm_amdgpu_cs_chunk_ib   ib_data;
@@ -578,6 +609,8 @@ struct drm_amdgpu_cs_chunk_data {
#define AMDGPU_INFO_SENSOR_VDDNB0x6
/* Subquery id: Query graphics voltage */
#define AMDGPU_INFO_SENSOR_VDDGFX   0x7
+/* Number of VRAM page faults on CPU access. */
+#define AMDGPU_INFO_NUM_VRAM_CPU_PAGE_FAULTS   0x1E
 
 #define AMDGPU_INFO_MMR_SE_INDEX_SHIFT 0
 #define AMDGPU_INFO_MMR_SE_INDEX_MASK  0xff
@@ -766,6 +799,25 @@ struct drm_amdgpu_info_device {
__u64 cntl_sb_buf_gpu_addr;
/* NGG Parameter Cache */
__u64 param_buf_gpu_addr;
+   __u32 prim_buf_size;
+   __u32 pos_buf_size;
+   __u32 cntl_sb_buf_size;
+   __u32 param_buf_size;
+   /* wavefront size*/
+   __u32 wave_front_size;
+   /* shader visible vgprs*/
+   __u32 num_shader_visible_vgprs;
+   /* CU per shader array*/
+   __u32 num_cu_per_sh;
+   /* number of tcc blocks*/
+   __u32 num_tcc_blocks;
+   /* gs vgt table depth*/
+   __u32 gs_vgt_table_depth;
+   /* gs primitive buffer depth*/
+   __u32 gs_prim_buffer_depth;
+   /* max gs wavefront per vgt*/
+   __u32 max_gs_waves_per_vgt;
+   __u32 _pad1;
 };
 
 struct drm_amdgpu_info_hw_ip {
-- 
2.9.4
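
For reference, the new VM ioctl can be driven from userspace roughly as
follows (an illustrative sketch using libdrm's drmCommandWriteRead; "fd"
is assumed to be an open render node for an amdgpu device):

    #include <string.h>
    #include <xf86drm.h>
    #include "amdgpu_drm.h"

    /* Sketch: reserve (or unreserve) a VMID via DRM_AMDGPU_VM. */
    static int set_vmid_reservation(int fd, int reserve)
    {
            union drm_amdgpu_vm args;

            memset(&args, 0, sizeof(args));
            args.in.op = reserve ? AMDGPU_VM_OP_RESERVE_VMID
                                 : AMDGPU_VM_OP_UNRESERVE_VMID;
            args.in.flags = 0; /* no flags defined so far */

            return drmCommandWriteRead(fd, DRM_AMDGPU_VM,
                                       &args, sizeof(args));
    }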



Re: [PATCH 3/3] drm/amdgpu: add gtt_sys_limit

2017-06-26 Thread Michel Dänzer
On 27/06/17 08:39 AM, John Brooks wrote:
> On Mon, Jun 26, 2017 at 03:39:57PM +0200, Christian König wrote:
>> From: Christian König 
>>
>> Limit the size of the GART table for the system domain.
>>
>> This saves us a bunch of visible VRAM, but also limits the maximum BO 
>> size we can swap out.
>>
>> Signed-off-by: Christian König 
> 
> Hmm.
> 
> On my system, GTT is 4096MiB (1048576 pages). For this, the table takes up
> 1048576*8 bytes = 8MiB. Reducing GTT to 256MiB (65536 pages) would reduce the
> size of the table to 512 KiB. A relatively large saving, to be sure. But in 
> the
> grander scheme of things, is saving 7.5MiB (3% of visible VRAM @ 256M) worth
> cutting GTT memory by a factor of 16?

I'm afraid not, especially since it would limit the maximum BO size to <
256MB, if I understand correctly. Pretty sure that would cause failures
with real world apps.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


Re: [PATCH v2 00/14] improve the fb_setcmap helper

2017-06-26 Thread Peter Rosin
On 2017-06-26 11:35, Daniel Vetter wrote:
> On Thu, Jun 22, 2017 at 08:06:23AM +0200, Peter Rosin wrote:
>> Hi!
>>
>> While trying to get CLUT support for the atmel_hlcdc driver, and
>> specifically for the emulated fbdev interface, I received some
>> push-back that my feeble in-driver attempts should be solved
>> by the core. This is my attempt to do it right.
>>
>> I have obviously not tested all of this with more than a compile,
>> but patches 1 and 3 are enough to make the atmel-hlcdc driver
>> do what I need (when patched to support CLUT modes). The rest is
>> just lots of removals and cleanup made possible by the improved
>> core.
>>
>> Please test, I would not be surprised if I have fouled up some
>> bit-manipulation somewhere in this mostly mechanical change...
>>
>> Changes since v1:
>>
>> - Rebased to next-20170621
>> - Split 1/11 into a preparatory patch, a cleanup patch and then
>>   the meat in 3/14.
>> - Handle pseudo-palette for FB_VISUAL_TRUECOLOR.
>> - Removed the empty .gamma_get/.gamma_set fb helpers from the
>>   armada driver that I had somehow managed to ignore but which
>>   0day found real quick.
>> - Be less judgemental on drivers only providing .gamma_get and
>>   .gamma_set, but no .load_lut. That's actually a valid thing
>>   to do if you only need pseudo-palette for FB_VISUAL_TRUECOLOR.
>> - Add a comment about colliding bitfields in the nouveau driver.
>> - Remove gamma_set/gamma_get declarations from the radeon driver
>>   (the definitions were removed in v1).
> 
> Ok some nits/questions on the first three, but in principle looks all ok I
> think. The driver patches also look good (but I didn't yet carefully
> review all the conversion). What we might want to do is entirely remove
> driver's reliance on ->gamma_store (mostly amounts to in-lining the
> load_lut functions) and only update ->gamma_store after gamma_set returned
> successfully. But that's a bit more work.
> 
> Save/restoring it instead might be simpler to fix that bug, but since it's
> pre-existing also ok as follow-up.

I'm traveling and cannot make progress this week. The merge window is
also really close, so this series will probably miss it unless
something unexpected happens...

I'll get back to this for the next cycle, just a heads up.

Cheers,
peda


Re: [PATCH 3/3] drm/amdgpu: add gtt_sys_limit

2017-06-26 Thread John Brooks
On Mon, Jun 26, 2017 at 03:39:57PM +0200, Christian König wrote:
> From: Christian König 
> 
> Limit the size of the GART table for the system domain.
> 
> This saves us a bunch of visible VRAM, but also limits the maximum BO size 
> we can swap out.
> 
> Signed-off-by: Christian König 

Hmm.

On my system, GTT is 4096MiB (1048576 pages). For this, the table takes up
1048576*8 bytes = 8MiB. Reducing GTT to 256MiB (65536 pages) would reduce the
size of the table to 512 KiB. A relatively large saving, to be sure. But in the
grander scheme of things, is saving 7.5MiB (3% of visible VRAM @ 256M) worth
cutting GTT memory by a factor of 16?
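
Spelled out, assuming 4 KiB pages and 8-byte GART entries:

    unsigned long long gtt_bytes   = 4096ULL << 20;   /* 4096 MiB GTT     */
    unsigned long long num_pages   = gtt_bytes >> 12; /* 1048576 pages    */
    unsigned long long table_bytes = num_pages * 8;   /* 8 MiB GART table */

    unsigned long long lim_bytes   = 256ULL << 20;    /* 256 MiB limit    */
    unsigned long long lim_pages   = lim_bytes >> 12; /* 65536 pages      */
    unsigned long long lim_table   = lim_pages * 8;   /* 512 KiB table    */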

John

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu.h | 2 ++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c  | 6 ++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 4 
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c| 8 ++--
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c | 6 --
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 6 +++---
>  7 files changed, 26 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index ab1dad2..a511029 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -76,6 +76,7 @@
>  extern int amdgpu_modeset;
>  extern int amdgpu_vram_limit;
>  extern int amdgpu_gart_size;
> +extern unsigned amdgpu_gart_sys_limit;
>  extern int amdgpu_moverate;
>  extern int amdgpu_benchmarking;
>  extern int amdgpu_testing;
> @@ -602,6 +603,7 @@ struct amdgpu_mc {
>   u64 mc_vram_size;
>   u64 visible_vram_size;
>   u64 gtt_size;
> + u64 gtt_sys_limit;
>   u64 gtt_start;
>   u64 gtt_end;
>   u64 vram_start;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 44484bb..b6edb83 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -1122,6 +1122,12 @@ static void amdgpu_check_arguments(struct 
> amdgpu_device *adev)
>   }
>   }
>  
> + if (amdgpu_gart_sys_limit < 32) {
> + dev_warn(adev->dev, "gart sys limit (%d) too small\n",
> +  amdgpu_gart_sys_limit);
> + amdgpu_gart_sys_limit = 32;
> + }
> +
>   amdgpu_check_vm_size(adev);
>  
>   amdgpu_check_block_size(adev);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index 5a1d794..907ae5e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -75,6 +75,7 @@
>  
>  int amdgpu_vram_limit = 0;
>  int amdgpu_gart_size = -1; /* auto */
> +unsigned amdgpu_gart_sys_limit = 256;
>  int amdgpu_moverate = -1; /* auto */
>  int amdgpu_benchmarking = 0;
>  int amdgpu_testing = 0;
> @@ -124,6 +125,9 @@ module_param_named(vramlimit, amdgpu_vram_limit, int, 
> 0600);
>  MODULE_PARM_DESC(gartsize, "Size of PCIE/IGP gart to setup in megabytes (32, 
> 64, etc., -1 = auto)");
>  module_param_named(gartsize, amdgpu_gart_size, int, 0600);
>  
> +MODULE_PARM_DESC(gartlimit, "GART limit for the system domain in megabytes 
> (default 256)");
> +module_param_named(gartlimit, amdgpu_gart_sys_limit, int, 0600);
> +
>  MODULE_PARM_DESC(moverate, "Maximum buffer migration rate in MB/s. (32, 64, 
> etc., -1=auto, 0=1=disabled)");
>  module_param_named(moverate, amdgpu_moverate, int, 0600);
>  
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> index 8877015..5c6a461 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> @@ -70,6 +70,9 @@ void amdgpu_gart_set_defaults(struct amdgpu_device *adev)
>   adev->mc.mc_vram_size);
>   else
>   adev->mc.gtt_size = (uint64_t)amdgpu_gart_size << 20;
> +
> + adev->mc.gtt_sys_limit = min((uint64_t)amdgpu_gart_sys_limit << 20,
> +  adev->mc.gtt_size);
>  }
>  
>  /**
> @@ -350,8 +353,9 @@ int amdgpu_gart_init(struct amdgpu_device *adev)
>   if (r)
>   return r;
>   /* Compute table size */
> - adev->gart.num_cpu_pages = adev->mc.gtt_size / PAGE_SIZE;
> - adev->gart.num_gpu_pages = adev->mc.gtt_size / AMDGPU_GPU_PAGE_SIZE;
> + adev->gart.num_cpu_pages = adev->mc.gtt_sys_limit / PAGE_SIZE;
> + adev->gart.num_gpu_pages = adev->mc.gtt_sys_limit /
> + AMDGPU_GPU_PAGE_SIZE;
>   DRM_INFO("GART: num cpu pages %u, num gpu pages %u\n",
>adev->gart.num_cpu_pages, adev->gart.num_gpu_pages);
>  
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
> 

Re: [PATCH 6/9] drm/amdgpu: Set/clear CPU_ACCESS_REQUIRED flag on page fault and CS

2017-06-26 Thread Marek Olšák
On Mon, Jun 26, 2017 at 11:27 AM, Michel Dänzer  wrote:
> On 25/06/17 03:00 AM, Christian König wrote:
>> Am 23.06.2017 um 19:39 schrieb John Brooks:
>>> When the AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED flag is given by
>>> userspace,
>>> it should only be treated as a hint to initially place a BO somewhere CPU
>>> accessible, rather than having a permanent effect on BO placement.
>>
>> And that is a clear NAK from my side.
>>
>> CPU_ACCESS_REQUIRED is a permanent limitation to where the buffer should
>> be placed.
>
> It really can't be more than a hint. The userspace driver cannot
> reliably know ahead of time whether a BO will be accessed by the CPU at
> all, let alone how often. A BO which incorrectly has this flag set
> creates artificial pressure on CPU visible VRAM.

I also think the flag is only a hint and shouldn't be taken too seriously.

Only AMDGPU_GEM_CREATE_NO_CPU_ACCESS has a strict behavior.

Marek


Re: [PATCH 7/9] drm/amdgpu: Throttle visible VRAM moves separately

2017-06-26 Thread John Brooks
On Mon, Jun 26, 2017 at 06:44:30PM +0900, Michel Dänzer wrote:
> On 24/06/17 02:39 AM, John Brooks wrote:
> > The BO move throttling code is designed to allow VRAM to fill quickly if it
> > is relatively empty. However, this does not take into account situations
> > where the visible VRAM is smaller than total VRAM, and total VRAM may not
> > be close to full but the visible VRAM segment is under pressure. In such
> > situations, visible VRAM would experience unrestricted swapping and
> > performance would drop.
> > 
> > Add a separate counter specifically for moves involving visible VRAM, and
> > check it before moving BOs there.
> > 
> > Fixes: 95844d20ae02 (drm/amdgpu: throttle buffer migrations at CS using a 
> > fixed MBps limit (v2))
> > Signed-off-by: John Brooks 
> 
> Something like this patch is definitely needed, good catch.
> 
> However, as is, one issue is that it incurs CPU overhead even when all of
> VRAM is CPU visible. Can that be avoided somehow?
> 

I guess that depends which part of it you consider to be expensive.

bytes_moved_vis_threshold isn't used unless visible VRAM is smaller than total
VRAM, so any work/instructions that go into computing it could be skipped in
that case (at the cost of checking that visible_vram_size < real_vram_size).
Would that help?
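
Something along these lines (a sketch only; compute_vis_threshold stands
in for whatever the final accounting ends up looking like):

    /* Skip the visible-VRAM accounting entirely when all of VRAM is
     * CPU-visible, so the common case pays only one comparison.
     */
    u64 bytes_moved_vis_threshold = 0;

    if (adev->mc.visible_vram_size < adev->mc.real_vram_size)
            bytes_moved_vis_threshold = compute_vis_threshold(adev);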

John

> 
> -- 
> Earthling Michel Dänzer   |   http://www.amd.com
> Libre software enthusiast | Mesa and X developer


[PATCH libdrm] libdrm/amdgpu: add interface for kernel semaphores

2017-06-26 Thread Dave Airlie
From: Dave Airlie 

This adds the corresponding code for libdrm to use the new
kernel interfaces for semaphores.

This will be used by radv to implement shared semaphores.

TODO: Version checks.

Signed-off-by: Dave Airlie 
---
 amdgpu/amdgpu.h  |  28 +
 amdgpu/amdgpu_cs.c   | 161 ---
 include/drm/amdgpu_drm.h |  28 +
 3 files changed, 208 insertions(+), 9 deletions(-)

diff --git a/amdgpu/amdgpu.h b/amdgpu/amdgpu.h
index 7b26a04..747e248 100644
--- a/amdgpu/amdgpu.h
+++ b/amdgpu/amdgpu.h
@@ -129,6 +129,8 @@ typedef struct amdgpu_va *amdgpu_va_handle;
  */
 typedef struct amdgpu_semaphore *amdgpu_semaphore_handle;
 
+typedef uint32_t amdgpu_sem_handle;
+
 /*--*/
 /* -- Structures -- */
 /*--*/
@@ -365,6 +367,16 @@ struct amdgpu_cs_request {
struct amdgpu_cs_fence_info fence_info;
 };
 
+struct amdgpu_cs_request_sem {
+   /*
+*
+*/
+   uint32_t number_of_wait_sem;
+   uint32_t *wait_sems;
+   uint32_t number_of_signal_sem;
+   uint32_t *signal_sems;
+};
+
 /**
  * Structure which provide information about GPU VM MC Address space
  * alignments requirements
@@ -882,6 +894,12 @@ int amdgpu_cs_submit(amdgpu_context_handle context,
 struct amdgpu_cs_request *ibs_request,
 uint32_t number_of_requests);
 
+int amdgpu_cs_submit_sem(amdgpu_context_handle context,
+uint64_t flags,
+struct amdgpu_cs_request *ibs_request,
+struct amdgpu_cs_request_sem *ibs_sem,
+uint32_t number_of_requests);
+
 /**
  *  Query status of Command Buffer Submission
  *
@@ -1255,4 +1273,14 @@ int amdgpu_cs_destroy_semaphore(amdgpu_semaphore_handle 
sem);
 */
 const char *amdgpu_get_marketing_name(amdgpu_device_handle dev);
 
+int amdgpu_cs_create_sem(amdgpu_device_handle dev,
+amdgpu_sem_handle *sem);
+int amdgpu_cs_export_sem(amdgpu_device_handle dev,
+ amdgpu_sem_handle sem,
+int *shared_handle);
+int amdgpu_cs_import_sem(amdgpu_device_handle dev,
+ int shared_handle,
+amdgpu_sem_handle *sem);
+int amdgpu_cs_destroy_sem(amdgpu_device_handle dev,
+ amdgpu_sem_handle sem);
 #endif /* #ifdef _AMDGPU_H_ */
diff --git a/amdgpu/amdgpu_cs.c b/amdgpu/amdgpu_cs.c
index fb5b3a8..7283327 100644
--- a/amdgpu/amdgpu_cs.c
+++ b/amdgpu/amdgpu_cs.c
@@ -170,7 +170,8 @@ int amdgpu_cs_query_reset_state(amdgpu_context_handle 
context,
  * \sa amdgpu_cs_submit()
 */
 static int amdgpu_cs_submit_one(amdgpu_context_handle context,
-   struct amdgpu_cs_request *ibs_request)
+   struct amdgpu_cs_request *ibs_request,
+   struct amdgpu_cs_request_sem *sem_request)
 {
union drm_amdgpu_cs cs;
uint64_t *chunk_array;
@@ -178,9 +179,11 @@ static int amdgpu_cs_submit_one(amdgpu_context_handle 
context,
struct drm_amdgpu_cs_chunk_data *chunk_data;
struct drm_amdgpu_cs_chunk_dep *dependencies = NULL;
struct drm_amdgpu_cs_chunk_dep *sem_dependencies = NULL;
+   struct drm_amdgpu_cs_chunk_sem *wait_sem_dependencies = NULL;
+   struct drm_amdgpu_cs_chunk_sem *signal_sem_dependencies = NULL;
struct list_head *sem_list;
amdgpu_semaphore_handle sem, tmp;
-   uint32_t i, size, sem_count = 0;
+   uint32_t i, j, size, sem_count = 0;
bool user_fence;
int r = 0;
 
@@ -196,7 +199,7 @@ static int amdgpu_cs_submit_one(amdgpu_context_handle 
context,
}
user_fence = (ibs_request->fence_info.handle != NULL);
 
-   size = ibs_request->number_of_ibs + (user_fence ? 2 : 1) + 1;
+   size = ibs_request->number_of_ibs + (user_fence ? 2 : 1) + 1 + 
(sem_request ? 2 : 0);
 
chunk_array = alloca(sizeof(uint64_t) * size);
chunks = alloca(sizeof(struct drm_amdgpu_cs_chunk) * size);
@@ -308,6 +311,45 @@ static int amdgpu_cs_submit_one(amdgpu_context_handle 
context,
chunks[i].chunk_data = (uint64_t)(uintptr_t)sem_dependencies;
}
 
+   if (sem_request) {
+   if (sem_request->number_of_wait_sem) {
+   wait_sem_dependencies = malloc(sizeof(struct 
drm_amdgpu_cs_chunk_sem) * sem_request->number_of_wait_sem);
+   if (!wait_sem_dependencies) {
+   r = -ENOMEM;
+   goto error_unlock;
+   }
+   for (j = 0; j < sem_request->number_of_wait_sem; j++) {
+   struct 
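
For context, the intended use of the entry points declared above would
look roughly like this (an illustrative sketch; "dev", "ctx" and
"ibs_request" are assumed to be set up elsewhere, and error handling is
omitted):

    amdgpu_sem_handle sem;
    int shared_fd;
    struct amdgpu_cs_request_sem sem_req = {0};

    amdgpu_cs_create_sem(dev, &sem);
    amdgpu_cs_export_sem(dev, sem, &shared_fd); /* hand to the waiter */

    sem_req.number_of_signal_sem = 1;
    sem_req.signal_sems = &sem;

    /* submit one request that signals the semaphore on completion */
    amdgpu_cs_submit_sem(ctx, 0, &ibs_request, &sem_req, 1);

    amdgpu_cs_destroy_sem(dev, sem);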

RE: [PATCH 3/3] drm/amdgpu: Add kernel parameter to control use of ECC/EDC.

2017-06-26 Thread Panariti, David


> -Original Message-
> From: Bridgman, John
> Sent: Monday, June 26, 2017 3:12 PM
> To: Xie, AlexBin ; Panariti, David
> ; Deucher, Alexander
> ; amd-gfx@lists.freedesktop.org
> Subject: RE: [PATCH 3/3] drm/amdgpu: Add kernel parameter to control use
> of ECC/EDC.
> 
> Agreed... one person's "best" is another person's "OMG I didn't want that".
[davep] However, I've questioned the sanity of many default choices.
> IMO we should have bits correspond to specific options as much as possible,
> modulo HW capabilities.
[davep] Yes, we can all agree that the word BEST, wasn't.
You can throw tomatoes at me when we're all in Markham.
But it's specifically the "modulo HW capabilities" that makes me want to use a 
single value to specify BEST (ADVANCED, RELIABLEST, SAFEST, AGGRESSIVE, 
PURPLE, COMPUTE, PROFESSIONAL, etc.) 
Please, suggestions for better than BEST.

To summarize, I propose:

DEFAULT: User need do nothing, no parameter.  As AlexBin suggested, this can be 
a conservative choice of options. Options defined per asic.
ADVANCED: More features than DEFAULT, but nothing known to break.  Things we're 
sure not everyone would want. Selected by a unique value e.g. (1 << 3).  
Options defined per asic.
ALL: All compatible features, mod mutually exclusive or otherwise incompatible 
to the HW.  Could be specified as 0xffffffff.
BITMASK: The ultimate catchall.  I'd even say no masking out of any bits, 
caveat usor.  Don't forget, we're users of this interface and who knows what 
will be useful for dev or testing.
NONE: ras_param=0

DEFAULT, ADVANCED, ALL and NONE could go into a 2 bit field since they're 
mutually exclusive.
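
As a sketch, that encoding could look like this (bit positions and names
are placeholders for discussion, not an agreed interface):

    /* 2-bit mutually exclusive mode field */
    #define RAS_MODE_NONE      0x0 /* no RAS features           */
    #define RAS_MODE_DEFAULT   0x1 /* conservative per-asic set */
    #define RAS_MODE_ADVANCED  0x2 /* DEFAULT plus extras       */
    #define RAS_MODE_ALL       0x3 /* all compatible features   */
    #define RAS_MODE_MASK      0x3

    /* individual feature bits above the mode field, for BITMASK use */
    #define RAS_VRAM_ECC       (1u << 2)
    #define RAS_SRAM_EDC       (1u << 3)
    #define RAS_PROP_FED       (1u << 4)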

Examples:
CZ and eCZ:
DEFAULT: everything except PROP_FED (halt ip/reboot)
ADVANCED: DEFAULT + PROP_FED.
ALL: Same as ADVANCED.
BITMASK: PROP_FED, no counters (who knows why, but we can do it)

Vega10:
DEFAULT: No ECC
ADVANCED: No ECC.
ALL: ECC
BITMASK: ECC, don't count on your results.

To me, the benefit of ADVANCED and ALL is that I, as a user, would want them to 
change with new drivers.  I want that logical behavior and don't want to check 
ChangeLogs to see what new bits I need.
I think, for example, an HPC customer would want to use ADVANCED because that 
is what we think is the most reliable. 
Basically they're just macros so in the most common cases, users don't need to 
worry about the bits.

Providing a bitmask argument allows anything to be overridden, and as an 
advanced user (such as a developer), system evaluator, etc, absolute 
flexibility is essential.

From AlexD earlier (I may have redundancies):
> BEST could be defined as
> 0xffffffff since each asic would only look at the masked bits for the 
> features it
> supports. 
[davep] This could cause problems if there are mutually exclusive or otherwise 
incompatible bits.  Then there would need to be code choosing one over the 
other which is what we'd need to do to define AGGRESSIVE anyway.
Also I've seen fields where 1 enables a feature and others where 1 disables a 
feature.
This is why I chose a single bit.  The driver will know what to do with BEST, 
especially if features are added or fixed. It's no more effort than specifying 
0xff...f
> I would prefer not to call it BEST though.  Maybe ALL.
[davep] See 0xfff..f for why ALL may not be BEST. Pun intended.

BTW:  I like the RAS_ prefix over ECC, since it is really the overarching 
concept.
> 
> RAS_NONE 0
> RAS_DEFAULT (1 << 0)
> RAS_VRAM_ECC (1 << 1)
> RAS_SRAM_EDC (1 << 2)
> ...
> RAS_ALL 0xffffffff

[davep] So I see this with a change to RAS_ALL to basically be equivalent to 
what I had considered BEST and should be a fixed bit such as (1 << 3), and we 
choose the options based on what we know at the time of a release.
0xff...f as a number as opposed to a bitmask could be used instead of (1 << 3)
> 
> >-Original Message-
> >From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf
> >Of Xie, AlexBin
> >Sent: Monday, June 26, 2017 2:12 PM
> >To: Panariti, David; Deucher, Alexander; amd-gfx@lists.freedesktop.org
> >Subject: RE: [PATCH 3/3] drm/amdgpu: Add kernel parameter to control
> >use of ECC/EDC.
> >
> >Hi,
> >
> >I have not checked the background of this discussion very closely yet.
> >You may already know the following.
> >
> >Customers may not want the default setting to change meaning. This is
> >like an API.
> >Example: The application and its environment are already set up and tested.
> >Then if the customer updates the driver, suddenly the driver has some new
> >behavior? A serious application definitely does not accept this.
> >
> >IMHO, it is better to avoid vague concepts like "best". It will become
> >rather difficult to define what is best when there are multiple
> >customers with different requirements. The driver is there to provide a
> >feature or mechanism. "Best" sounds like a policy or a preference from
> >the driver side.
> >
> >In my past work, I generally use 

[PATCH 01/13] drm/amd/display: MST atomic_get_property missing.

2017-06-26 Thread Harry Wentland
From: Andrey Grodzovsky 

The missing function implementation was leading to EINVAL
in the UMD, which prevented the MST connector from being
added to the X topology and hence from getting a mode set.

Change-Id: I913039185c3e1ab13605ef394a5e7c550025db4a
Signed-off-by: Andrey Grodzovsky 
Reviewed-by: Harry Wentland 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
index 5fc963deedf7..3349489d7991 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
@@ -170,7 +170,8 @@ static const struct drm_connector_funcs 
dm_dp_mst_connector_funcs = {
.set_property = drm_atomic_helper_connector_set_property,
.atomic_duplicate_state = amdgpu_dm_connector_atomic_duplicate_state,
.atomic_destroy_state = drm_atomic_helper_connector_destroy_state,
-   .atomic_set_property = amdgpu_dm_connector_atomic_set_property
+   .atomic_set_property = amdgpu_dm_connector_atomic_set_property,
+   .atomic_get_property = amdgpu_dm_connector_atomic_get_property
 };
 
 static int dm_dp_mst_get_modes(struct drm_connector *connector)
-- 
2.11.0



[PATCH 02/13] drm/amd/display: Add global lock function.

2017-06-26 Thread Harry Wentland
From: Andrey Grodzovsky 

Change-Id: Ifc5e7c383c9be8fdda0989b9a91f8602487ccbb5
Signed-off-by: Andrey Grodzovsky 
Reviewed-by: Harry Wentland 
---
 .../drm/amd/display/amdgpu_dm/amdgpu_dm_types.c| 66 --
 1 file changed, 49 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_types.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_types.c
index 038a94e4fe18..5d8bb9c2fa4a 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_types.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_types.c
@@ -3021,6 +3021,49 @@ static enum surface_update_type  
amdgpu_dm_check_surfaces_update_type(
return update_type;
 }
 
+/*
+ * Grabs all modesetting locks to serialize against any blocking commits,
+ * Waits for completion of all non blocking commits.
+ */
+static void aquire_global_lock(
+   struct drm_device *dev,
+   struct drm_atomic_state *state)
+{
+   struct drm_crtc *crtc;
+   struct drm_crtc_commit *commit;
+   long ret;
+
+   /* Adding all modeset locks to aquire_ctx will
+* ensure that when the framework release it the
+* extra locks we are locking here will get released to
+*/
+   drm_modeset_lock_all_ctx(dev, state->acquire_ctx);
+
+   list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
+   spin_lock(&crtc->commit_lock);
+   commit = list_first_entry_or_null(&crtc->commit_list,
+   struct drm_crtc_commit, commit_entry);
+   if (commit)
+   drm_crtc_commit_get(commit);
+   spin_unlock(&crtc->commit_lock);
+
+   if (!commit)
+   continue;
+
+   /* Make sure all pending HW programming completed and
+* page flips done
+*/
+   ret = wait_for_completion_timeout(&commit->hw_done,
+ 10*HZ);
+   ret = wait_for_completion_timeout(&commit->flip_done,
+ 10*HZ);
+   if (ret == 0)
+   DRM_ERROR("[CRTC:%d:%s] hw_done timed out\n",
+ crtc->base.id, crtc->name);
+   drm_crtc_commit_put(commit);
+   }
+}
+
 int amdgpu_dm_atomic_check(struct drm_device *dev,
struct drm_atomic_state *state)
 {
@@ -3277,7 +3320,7 @@ int amdgpu_dm_atomic_check(struct drm_device *dev,
dc,
set[i].surfaces,
set[i].surface_count,
-   set[i].stream) > UPDATE_TYPE_MED) {
+   set[i].stream) > UPDATE_TYPE_FAST) {
wait_for_prev_commits = true;
break;
}
@@ -3291,25 +3334,14 @@ int amdgpu_dm_atomic_check(struct drm_device *dev,
 * For full updates case when
 * removing/adding/updating streams on one CRTC while 
flipping
 * on another CRTC,
-* Adding all current active CRTC's states to the atomic commit 
in
-* amdgpu_dm_atomic_check will guarantee that any such full 
update commit
+* acquiring global lock  will guarantee that any such full
+* update commit
 * will wait for completion of any outstanding flip using DRMs
 * synchronization events.
 */
-   if (wait_for_prev_commits) {
-   list_for_each_entry(crtc, >mode_config.crtc_list, 
head) {
-   struct amdgpu_crtc *acrtc = 
to_amdgpu_crtc(crtc);
-   struct drm_crtc_state *crtc_state;
-
-   if (acrtc->stream) {
-   crtc_state = 
drm_atomic_get_crtc_state(state, crtc);
-   if (IS_ERR(crtc_state)) {
-   ret = PTR_ERR(crtc_state);
-   break;
-   }
-   }
-   }
-   }
+   if (wait_for_prev_commits)
+   aquire_global_lock(dev, state);
+
}
 
if (context) {
-- 
2.11.0



[PATCH 11/13] drm/amd/display: Disable pipe split.

2017-06-26 Thread Harry Wentland
From: Yongqiang Sun 

Change-Id: I7d3c8677e236003386ddc7dfd3511b6a13a89829
Signed-off-by: Yongqiang Sun 
Reviewed-by: Hersen Wu 
Acked-by: Harry Wentland 
---
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c 
b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c
index c7840e0e3ae5..eebaffca8e75 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c
@@ -443,7 +443,7 @@ static const struct dc_debug debug_defaults_drv = {
.disable_pplib_wm_range = false,
 #if defined(CONFIG_DRM_AMD_DC_DCN1_0)
.use_dml_wm = false,
-   .disable_pipe_split = false
+   .disable_pipe_split = true
 #endif
 };
 
-- 
2.11.0



[PATCH 05/13] drm/amd/display: make variable latency into a regkey option

2017-06-26 Thread Harry Wentland
From: Dmytro Laktyushkin 

Change-Id: I4bf181d16258f8fa871ebd4fa41b2140f9bf0c60
Signed-off-by: Dmytro Laktyushkin 
Reviewed-by: Charlene Liu 
Acked-by: Harry Wentland 
---
 drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.c | 11 ---
 drivers/gpu/drm/amd/display/dc/dc.h  |  2 ++
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.c 
b/drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.c
index 9cb08365e2b6..3ec702fecfd1 100644
--- a/drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.c
+++ b/drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.c
@@ -906,11 +906,16 @@ bool dcn_validate_bandwidth(
scaler_settings_calculation(v);
mode_support_and_system_configuration(v);
 
-   if (v->voltage_level == 0) {
+   if (v->voltage_level == 0 &&
+   (dc->public.debug.sr_exit_time_dpm0_ns
+   || 
dc->public.debug.sr_enter_plus_exit_time_dpm0_ns)) {
struct core_dc *dc_core = DC_TO_CORE(&dc->public);
 
-   v->sr_enter_plus_exit_time = 9.466f;
-   v->sr_exit_time = 7.849f;
+   if (dc->public.debug.sr_enter_plus_exit_time_dpm0_ns)
+   v->sr_enter_plus_exit_time =
+   
dc->public.debug.sr_enter_plus_exit_time_dpm0_ns / 1000.0f;
+   if (dc->public.debug.sr_exit_time_dpm0_ns)
+   v->sr_exit_time =  
dc->public.debug.sr_exit_time_dpm0_ns / 1000.0f;
dc_core->dml.soc.sr_enter_plus_exit_time_us = 
v->sr_enter_plus_exit_time;
dc_core->dml.soc.sr_exit_time_us = v->sr_exit_time;
mode_support_and_system_configuration(v);
diff --git a/drivers/gpu/drm/amd/display/dc/dc.h 
b/drivers/gpu/drm/amd/display/dc/dc.h
index 62493c4a47d1..cb70b6d8be10 100644
--- a/drivers/gpu/drm/amd/display/dc/dc.h
+++ b/drivers/gpu/drm/amd/display/dc/dc.h
@@ -171,6 +171,8 @@ struct dc_debug {
bool disable_pplib_wm_range;
bool use_dml_wm;
bool disable_pipe_split;
+   int sr_exit_time_dpm0_ns;
+   int sr_enter_plus_exit_time_dpm0_ns;
int sr_exit_time_ns;
int sr_enter_plus_exit_time_ns;
int urgent_latency_ns;
-- 
2.11.0



[PATCH 10/13] drm/amd/display: Global lock typos fix.

2017-06-26 Thread Harry Wentland
From: Andrey Grodzovsky 

Fix typos.

Change-Id: Ib152dffe835dd7b11ad2fca20d467dd36edca70d
Signed-off-by: Andrey Grodzovsky 
Reviewed-by: Roman Li 
Acked-by: Harry Wentland 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_types.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_types.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_types.c
index 7fd5b5c73e81..2aa7c5010da1 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_types.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_types.c
@@ -3010,7 +3010,7 @@ static int do_aquire_global_lock(
>flip_done, 10*HZ);
 
if (ret == 0)
-   DRM_ERROR("[CRTC:%d:%s] cleanup_done or flip_done "
+   DRM_ERROR("[CRTC:%d:%s] hw_done or flip_done "
"timed out\n", crtc->base.id, 
crtc->name);
 
drm_crtc_commit_put(commit);
@@ -3301,9 +3301,9 @@ int amdgpu_dm_atomic_check(struct drm_device *dev,
 
if (ret != 0) {
if (ret == -EDEADLK)
-   DRM_DEBUG_KMS("Atomic check stopped due to to deadlock, 
retrying.\n");
+   DRM_DEBUG_KMS("Atomic check stopped due to to 
deadlock.\n");
else if (ret == -EINTR || ret == -EAGAIN || ret == -ERESTARTSYS)
-   DRM_DEBUG_KMS("Atomic check stopped due to to signal, 
retrying.\n");
+   DRM_DEBUG_KMS("Atomic check stopped due to to 
signal.\n");
else
DRM_ERROR("Atomic check failed.\n");
}
-- 
2.11.0



[PATCH 00/13] DC Linux Patches Jun 26, 2017

2017-06-26 Thread Harry Wentland
 * Global lock to properly lock DC commits
 * Fix single MST support (atomic_get_property missing)
 * IGT fix for multi-plane configurations
 * Updated DCN bandwidth formula
 * Bunch of minor fixes

Andrey Grodzovsky (5):
  drm/amd/display: MST atomic_get_property missing.
  drm/amd/display: Add global lock function.
  drm/amd/display: Remove check update type function.
  drm/amd/display: Refine globallock.
  drm/amd/display: Global lock typos fix.

Charlene Liu (1):
  drm/amd/display: w/a no color space info for HDMI when build AVI

Dmytro Laktyushkin (4):
  drm/amd/display: use different sr latencies for dpm0 dcn bw calc
  drm/amd/display: make variable latency into a regkey option
  drm/amd/display: dcn bw_calc_auto update rev 247 to 250
  drm/amd/display: fix bw_calc_auto translation error

Harry Wentland (1):
  drm/amd/display: Don't program scaler if we have no surface

Leo (Sunpeng) Li (1):
  drm/amd/display: Workaround IGT multiplane restriction

Yongqiang Sun (1):
  drm/amd/display: Disable pipe split.

 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c  |   14 +-
 .../amd/display/amdgpu_dm/amdgpu_dm_mst_types.c|3 +-
 .../drm/amd/display/amdgpu_dm/amdgpu_dm_types.c|  149 +-
 .../gpu/drm/amd/display/dc/calcs/dcn_calc_auto.c   | 3535 +---
 drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.c   |   23 +
 drivers/gpu/drm/amd/display/dc/core/dc_resource.c  |3 +
 drivers/gpu/drm/amd/display/dc/dc.h|2 +
 .../amd/display/dc/dce110/dce110_hw_sequencer.c|8 +-
 .../gpu/drm/amd/display/dc/dcn10/dcn10_resource.c  |2 +-
 9 files changed, 1011 insertions(+), 2728 deletions(-)

-- 
2.11.0



[PATCH 04/13] drm/amd/display: use different sr latencies for dpm0 dcn bw calc

2017-06-26 Thread Harry Wentland
From: Dmytro Laktyushkin 

Change-Id: I5c49ef7cb779c1238796a2d071432255653afa64
Signed-off-by: Dmytro Laktyushkin 
Reviewed-by: Charlene Liu 
Acked-by: Harry Wentland 
---
 drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.c | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.c 
b/drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.c
index 0aa6662650cc..9cb08365e2b6 100644
--- a/drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.c
+++ b/drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.c
@@ -906,6 +906,16 @@ bool dcn_validate_bandwidth(
scaler_settings_calculation(v);
mode_support_and_system_configuration(v);
 
+   if (v->voltage_level == 0) {
+   struct core_dc *dc_core = DC_TO_CORE(&dc->public);
+
+   v->sr_enter_plus_exit_time = 9.466f;
+   v->sr_exit_time = 7.849f;
+   dc_core->dml.soc.sr_enter_plus_exit_time_us = 
v->sr_enter_plus_exit_time;
+   dc_core->dml.soc.sr_exit_time_us = v->sr_exit_time;
+   mode_support_and_system_configuration(v);
+   }
+
if (v->voltage_level != 5) {
float bw_consumed = 
v->total_bandwidth_consumed_gbyte_per_second;
if (bw_consumed < v->fabric_and_dram_bandwidth_vmin0p65)
@@ -1013,6 +1023,14 @@ bool dcn_validate_bandwidth(
&dc_core->dml, context, pool);
}
 
+   if (v->voltage_level == 0) {
+   struct core_dc *dc_core = DC_TO_CORE(>public);
+
+   dc_core->dml.soc.sr_enter_plus_exit_time_us =
+   dc_core->dcn_soc.sr_enter_plus_exit_time;
+   dc_core->dml.soc.sr_exit_time_us = 
dc_core->dcn_soc.sr_exit_time;
+   }
+
kernel_fpu_end();
return v->voltage_level != 5;
 }
-- 
2.11.0



[PATCH 07/13] drm/amd/display: Refine globallock.

2017-06-26 Thread Harry Wentland
From: Andrey Grodzovsky 

Switch to wait_for_completion_interruptible_timeout wait
since the lock is called from IOCTL context and can be
interrupted by a signal.

The global lock function might return EDEADLK or EINTR, which
is not an error and just signals user mode to restart
the call.
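
From the user-mode side this is the usual retry pattern around the
ioctl, which is also what libdrm's drmIoctl() wrapper already does
(sketch; "req" is an already-filled struct drm_mode_atomic):

    /* Retry the ioctl when the kernel reports a restartable condition. */
    int ret;
    do {
            ret = ioctl(fd, DRM_IOCTL_MODE_ATOMIC, &req);
    } while (ret == -1 && (errno == EINTR || errno == EAGAIN));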

Change-Id: Ibe11bba5ed75f8f291e3eff4035e6b868a01498c
Signed-off-by: Andrey Grodzovsky 
Reviewed-by: Harry Wentland 
---
 .../drm/amd/display/amdgpu_dm/amdgpu_dm_types.c| 35 +++---
 1 file changed, 24 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_types.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_types.c
index dc635db4cc7e..7fd5b5c73e81 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_types.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_types.c
@@ -2973,7 +2973,7 @@ static uint32_t remove_from_val_sets(
  * Grabs all modesetting locks to serialize against any blocking commits,
  * Waits for completion of all non blocking commits.
  */
-static void do_aquire_global_lock(
+static int do_aquire_global_lock(
struct drm_device *dev,
struct drm_atomic_state *state)
 {
@@ -2985,7 +2985,9 @@ static void do_aquire_global_lock(
 * ensure that when the framework release it the
 * extra locks we are locking here will get released to
 */
-   drm_modeset_lock_all_ctx(dev, state->acquire_ctx);
+   ret = drm_modeset_lock_all_ctx(dev, state->acquire_ctx);
+   if (ret)
+   return ret;
 
list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
spin_lock(>commit_lock);
@@ -3001,15 +3003,20 @@ static void do_aquire_global_lock(
/* Make sure all pending HW programming completed and
 * page flips done
 */
-   ret = wait_for_completion_timeout(&commit->hw_done,
- 10*HZ);
-   ret = wait_for_completion_timeout(&commit->flip_done,
- 10*HZ);
+   ret = 
wait_for_completion_interruptible_timeout(&commit->hw_done, 10*HZ);
+
+   if (ret > 0)
+   ret = wait_for_completion_interruptible_timeout(
+   &commit->flip_done, 10*HZ);
+
if (ret == 0)
-   DRM_ERROR("[CRTC:%d:%s] hw_done timed out\n",
- crtc->base.id, crtc->name);
+   DRM_ERROR("[CRTC:%d:%s] cleanup_done or flip_done "
+   "timed out\n", crtc->base.id, 
crtc->name);
+
drm_crtc_commit_put(commit);
}
+
+   return ret < 0 ? ret : 0;
 }
 
 int amdgpu_dm_atomic_check(struct drm_device *dev,
@@ -3276,7 +3283,7 @@ int amdgpu_dm_atomic_check(struct drm_device *dev,
 * synchronization events.
 */
if (aquire_global_lock)
-   do_aquire_global_lock(dev, state);
+   ret = do_aquire_global_lock(dev, state);
 
}
 
@@ -3292,8 +3299,14 @@ int amdgpu_dm_atomic_check(struct drm_device *dev,
for (i = 0; i < new_stream_count; i++)
dc_stream_release(new_streams[i]);
 
-   if (ret != 0)
-   DRM_ERROR("Atomic check failed.\n");
+   if (ret != 0) {
+   if (ret == -EDEADLK)
+   DRM_DEBUG_KMS("Atomic check stopped due to to deadlock, 
retrying.\n");
+   else if (ret == -EINTR || ret == -EAGAIN || ret == -ERESTARTSYS)
+   DRM_DEBUG_KMS("Atomic check stopped due to to signal, 
retrying.\n");
+   else
+   DRM_ERROR("Atomic check failed.\n");
+   }
 
return ret;
 }
-- 
2.11.0



[PATCH 03/13] drm/amd/display: Remove check update type function.

2017-06-26 Thread Harry Wentland
From: Andrey Grodzovsky 

Due to using dc_commit_surface_to_stream instead of building
stream and surface updates, any surface commit today is
evaluated as full. Until we fix this and can correctly
evaluate the type of surface update, anything which is not a
page flip or cursor update will be treated as a full update
and the global lock will be acquired.

Change-Id: I4cc831d83d2b8c4f68822021e606c415192f8046
Signed-off-by: Andrey Grodzovsky 
Reviewed-by: Harry Wentland 
---
 .../drm/amd/display/amdgpu_dm/amdgpu_dm_types.c| 80 +++---
 1 file changed, 8 insertions(+), 72 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_types.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_types.c
index 5d8bb9c2fa4a..dc635db4cc7e 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_types.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_types.c
@@ -2969,63 +2969,11 @@ static uint32_t remove_from_val_sets(
return set_count;
 }
 
-
-static enum surface_update_type  amdgpu_dm_check_surfaces_update_type(
-   struct dc *dc,
-   const struct dc_surface **new_surfaces,
-   uint8_t new_surface_count,
-   const struct dc_stream *dc_stream)
-{
-   struct dc_surface_update srf_updates[MAX_SURFACES];
-   struct dc_flip_addrs flip_addr[MAX_SURFACES];
-   struct dc_plane_info plane_info[MAX_SURFACES];
-   struct dc_scaling_info scaling_info[MAX_SURFACES];
-   int i;
-   const struct dc_stream_status *stream_status =
-   dc_stream_get_status(dc_stream);
-   enum surface_update_type update_type;
-
-   memset(srf_updates, 0, sizeof(srf_updates));
-   memset(flip_addr, 0, sizeof(flip_addr));
-   memset(plane_info, 0, sizeof(plane_info));
-   memset(scaling_info, 0, sizeof(scaling_info));
-
-   for (i = 0; i < new_surface_count; i++) {
-   srf_updates[i].surface = new_surfaces[i];
-   srf_updates[i].gamma =
-   (struct dc_gamma *)new_surfaces[i]->gamma_correction;
-   flip_addr[i].address = new_surfaces[i]->address;
-   flip_addr[i].flip_immediate = new_surfaces[i]->flip_immediate;
-   plane_info[i].color_space = new_surfaces[i]->color_space;
-   plane_info[i].format = new_surfaces[i]->format;
-   plane_info[i].plane_size = new_surfaces[i]->plane_size;
-   plane_info[i].rotation = new_surfaces[i]->rotation;
-   plane_info[i].horizontal_mirror = 
new_surfaces[i]->horizontal_mirror;
-   plane_info[i].stereo_format = new_surfaces[i]->stereo_format;
-   plane_info[i].tiling_info = new_surfaces[i]->tiling_info;
-   plane_info[i].visible = new_surfaces[i]->visible;
-   plane_info[i].dcc = new_surfaces[i]->dcc;
-   scaling_info[i].scaling_quality = 
new_surfaces[i]->scaling_quality;
-   scaling_info[i].src_rect = new_surfaces[i]->src_rect;
-   scaling_info[i].dst_rect = new_surfaces[i]->dst_rect;
-   scaling_info[i].clip_rect = new_surfaces[i]->clip_rect;
-
-   srf_updates[i].flip_addr = &flip_addr[i];
-   srf_updates[i].plane_info = &plane_info[i];
-   srf_updates[i].scaling_info = &scaling_info[i];
-   }
-
-   update_type = dc_check_update_surfaces_for_stream(
-   dc, srf_updates, new_surface_count, NULL, 
stream_status);
-
-   return update_type;
-}
-
 /*
  * Grabs all modesetting locks to serialize against any blocking commits,
  * Waits for completion of all non blocking commits.
  */
-static void aquire_global_lock(
+static void do_aquire_global_lock(
struct drm_device *dev,
struct drm_atomic_state *state)
 {
@@ -3088,7 +3036,7 @@ int amdgpu_dm_atomic_check(struct drm_device *dev,
 * This bool will be set to true for any modeset/reset
 * or surface update which implies a non-fast surface update.
 */
-   bool wait_for_prev_commits = false;
+   bool aquire_global_lock = false;
 
ret = drm_atomic_helper_check(dev, state);
 
@@ -3173,7 +3121,7 @@ int amdgpu_dm_atomic_check(struct drm_device *dev,
 
new_stream_count++;
need_to_validate = true;
-   wait_for_prev_commits = true;
+   aquire_global_lock = true;
 
} else if (modereset_required(crtc_state)) {
 
@@ -3183,7 +3131,7 @@ int amdgpu_dm_atomic_check(struct drm_device *dev,
set,
set_count,
acrtc->stream);
-   wait_for_prev_commits = true;
+   aquire_global_lock = true;
}
}
 

[PATCH 08/13] drm/amd/display: fix bw_calc_auto translation error

2017-06-26 Thread Harry Wentland
From: Dmytro Laktyushkin 

The compiler was warning about conditions that will never evaluate
to true. The problem was that the VBA translator didn't translate
the conditions correctly.
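
The underlying C pitfall is that ! binds tighter than ==, so the
generated conditions parsed as ((!x) == y) rather than the intended
(x != y):

    int x = dcn_bw_hor;   /* suppose dcn_bw_hor is a nonzero enum value */

    if (!x == dcn_bw_hor) /* parses as (!x) == dcn_bw_hor: 0 == nonzero,
                           * false even though x matches dcn_bw_hor     */
            ;

    if (x != dcn_bw_hor)  /* the intended test */
            ;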

Change-Id: Ie5b3e674460bdd46e9854480c2fa5ae732427dea
Signed-off-by: Dmytro Laktyushkin 
Reviewed-by: Harry Wentland 
---
 drivers/gpu/drm/amd/display/dc/calcs/dcn_calc_auto.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/calcs/dcn_calc_auto.c 
b/drivers/gpu/drm/amd/display/dc/calcs/dcn_calc_auto.c
index bb2f8ad6a988..fb5d8db33a82 100644
--- a/drivers/gpu/drm/amd/display/dc/calcs/dcn_calc_auto.c
+++ b/drivers/gpu/drm/amd/display/dc/calcs/dcn_calc_auto.c
@@ -128,7 +128,7 @@ void mode_support_and_system_configuration(struct 
dcn_bw_internal_vars *v)
 
v->source_format_pixel_and_scan_support = dcn_bw_yes;
for (k = 0; k <= v->number_of_active_planes - 1; k++) {
-   if ((v->source_surface_mode[k] == dcn_bw_sw_linear && 
!v->source_scan[k] == dcn_bw_hor) || ((v->source_surface_mode[k] == 
dcn_bw_sw_4_kb_d || v->source_surface_mode[k] == dcn_bw_sw_4_kb_d_x || 
v->source_surface_mode[k] == dcn_bw_sw_64_kb_d || v->source_surface_mode[k] == 
dcn_bw_sw_64_kb_d_t || v->source_surface_mode[k] == dcn_bw_sw_64_kb_d_x || 
v->source_surface_mode[k] == dcn_bw_sw_var_d || v->source_surface_mode[k] == 
dcn_bw_sw_var_d_x) && !v->source_pixel_format[k] == dcn_bw_rgb_sub_64)) {
+   if ((v->source_surface_mode[k] == dcn_bw_sw_linear && 
v->source_scan[k] != dcn_bw_hor) || ((v->source_surface_mode[k] == 
dcn_bw_sw_4_kb_d || v->source_surface_mode[k] == dcn_bw_sw_4_kb_d_x || 
v->source_surface_mode[k] == dcn_bw_sw_64_kb_d || v->source_surface_mode[k] == 
dcn_bw_sw_64_kb_d_t || v->source_surface_mode[k] == dcn_bw_sw_64_kb_d_x || 
v->source_surface_mode[k] == dcn_bw_sw_var_d || v->source_surface_mode[k] == 
dcn_bw_sw_var_d_x) && v->source_pixel_format[k] != dcn_bw_rgb_sub_64)) {
v->source_format_pixel_and_scan_support = dcn_bw_no;
}
}
-- 
2.11.0



[PATCH 12/13] drm/amd/display: Don't program scaler if we have no surface

2017-06-26 Thread Harry Wentland
If we don't have a surface in dc_commit_streams scl_data won't get
populated in resource_build_scaling_params_for_context. In this case we
shouldn't attempt to program the scaler.

Change-Id: Ib901a062e4b0b897629ee1a1eaf2e7e1a5334cbf
Signed-off-by: Harry Wentland 
Reviewed-by: Dmytro Laktyushkin 
Acked-by: Harry Wentland 
---
 drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c 
b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
index a83d260cda45..0bab85b27a85 100644
--- a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
@@ -1099,10 +1099,12 @@ static enum dc_status apply_single_controller_ctx_to_hw(
 
pipe_ctx->scl_data.lb_params.alpha_en = pipe_ctx->bottom_pipe != 0;
/* program_scaler and allocate_mem_input are not new asic */
-   if (!pipe_ctx_old || memcmp(_ctx_old->scl_data,
-   _ctx->scl_data,
-   sizeof(struct scaler_data)) != 0)
+   if ((!pipe_ctx_old ||
+memcmp(_ctx_old->scl_data, _ctx->scl_data,
+   sizeof(struct scaler_data)) != 0) &&
+pipe_ctx->surface) {
program_scaler(dc, pipe_ctx);
+   }
 
/* mst support - use total stream count */
 #if defined(CONFIG_DRM_AMD_DC_DCN1_0)
-- 
2.11.0



[PATCH 09/13] drm/amd/display: Workaround IGT multiplane restriction

2017-06-26 Thread Harry Wentland
From: "Leo (Sunpeng) Li" 

IGT currently does not properly commit changes on planes with multiple
possible CRTCs. Set one valid CRTC for each plane for now, plus one
underlay plane on Carrizo and Stoney that is valid for all CRTCs.

Change-Id: Ifcc37754a6e93e6fd9693d028354fa678c3a5d72
Signed-off-by: Leo (Sunpeng) Li 
Reviewed-by: Harry Wentland 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 3bdf683e3e81..605da366ae10 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -1189,6 +1189,7 @@ int amdgpu_dm_initialize_drm_device(struct amdgpu_device 
*adev)
struct amdgpu_encoder *aencoder = NULL;
struct amdgpu_mode_info *mode_info = >mode_info;
uint32_t link_cnt;
+   unsigned long possible_crtcs;
 
link_cnt = dm->dc->caps.max_links;
if (amdgpu_dm_mode_config_init(dm->adev)) {
@@ -1204,7 +1205,18 @@ int amdgpu_dm_initialize_drm_device(struct amdgpu_device 
*adev)
goto fail_free_planes;
}
mode_info->planes[i]->base.type = mode_info->plane_type[i];
-   if (amdgpu_dm_plane_init(dm, mode_info->planes[i], 0xff)) {
+
+   /*
+* HACK: IGT tests expect that each plane can only have one
+* possible CRTC. For now, set one CRTC for each
+* plane that is not an underlay, but still allow multiple
+* CRTCs for underlay planes.
+*/
+   possible_crtcs = 1 << i;
+   if (i >= dm->dc->caps.max_streams)
+   possible_crtcs = 0xff;
+
+   if (amdgpu_dm_plane_init(dm, mode_info->planes[i], 
possible_crtcs)) {
DRM_ERROR("KMS: Failed to initialize plane\n");
goto fail_free_planes;
}
-- 
2.11.0



[PATCH 13/13] drm/amd/display: w/a no color space info for HDMI when build AVI

2017-06-26 Thread Harry Wentland
From: Charlene Liu 

Change-Id: Ib1f8c3b34b0144edade8331b0c2782c1df793020
Signed-off-by: Charlene Liu 
Reviewed-by: Aric Cyr 
Acked-by: Harry Wentland 
---
 drivers/gpu/drm/amd/display/dc/core/dc_resource.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
index a4c8c43873b4..9aff47eb33bd 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
@@ -1605,6 +1605,9 @@ static void set_avi_info_frame(
union hdmi_info_packet *hdmi_info = 
_frame.avi_info_packet.info_packet_hdmi;
 
color_space = pipe_ctx->stream->public.output_color_space;
+   if (color_space == COLOR_SPACE_UNKNOWN)
+   color_space = (stream->public.timing.pixel_encoding == 
PIXEL_ENCODING_RGB)?
+   COLOR_SPACE_SRGB:COLOR_SPACE_YCBCR709;
 
/* Initialize header */
hdmi_info->bits.header.info_frame_type = HDMI_INFOFRAME_TYPE_AVI;
-- 
2.11.0



[PATCH 7/8] drm/amdgpu: add plumbing for ctx priority changes v2

2017-06-26 Thread Andres Rodriguez
Introduce amdgpu_ctx_priority_override(), a mechanism to override a
context's priority.

An override can be terminated by setting the override to
AMD_SCHED_PRIORITY_UNSET.
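
A brief usage sketch (the calling context is illustrative; only the function
and the enum values come from this series):

	/* Raise all of the context's entities to high SW priority ... */
	amdgpu_ctx_priority_override(ctx, AMD_SCHED_PRIORITY_HIGH_SW);

	/* ... and later terminate the override; the entities fall back to
	 * the priority the context was created with (ctx->init_priority). */
	amdgpu_ctx_priority_override(ctx, AMD_SCHED_PRIORITY_UNSET);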

v2: change refcounted interface for a direct set

Signed-off-by: Andres Rodriguez 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h |  6 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 29 +
 2 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 7c6fca6..0a24c1f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -825,7 +825,9 @@ struct amdgpu_ctx {
spinlock_t  ring_lock;
struct dma_fence**fences;
struct amdgpu_ctx_ring  rings[AMDGPU_MAX_RINGS];
-   bool preamble_presented;
+   boolpreamble_presented;
+   enum amd_sched_priority init_priority;
+   enum amd_sched_priority override_priority;
 };
 
 struct amdgpu_ctx_mgr {
@@ -842,6 +844,8 @@ uint64_t amdgpu_ctx_add_fence(struct amdgpu_ctx *ctx, 
struct amdgpu_ring *ring,
  struct dma_fence *fence);
 struct dma_fence *amdgpu_ctx_get_fence(struct amdgpu_ctx *ctx,
   struct amdgpu_ring *ring, uint64_t seq);
+void amdgpu_ctx_priority_override(struct amdgpu_ctx *ctx,
+ enum amd_sched_priority priority);
 
 int amdgpu_ctx_ioctl(struct drm_device *dev, void *data,
 struct drm_file *filp);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
index e4de0fb..bf05180 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
@@ -72,6 +72,8 @@ static int amdgpu_ctx_init(struct amdgpu_device *adev,
}
 
ctx->reset_counter = atomic_read(>gpu_reset_counter);
+   ctx->init_priority = priority;
+   ctx->override_priority = AMD_SCHED_PRIORITY_UNSET;
 
/* create context entity for each ring */
for (i = 0; i < adev->num_rings; i++) {
@@ -360,6 +362,33 @@ struct dma_fence *amdgpu_ctx_get_fence(struct amdgpu_ctx 
*ctx,
return fence;
 }
 
+void amdgpu_ctx_priority_override(struct amdgpu_ctx *ctx,
+ enum amd_sched_priority priority)
+{
+   int i;
+   struct amdgpu_device *adev = ctx->adev;
+   struct amd_sched_rq *rq;
+   struct amd_sched_entity *entity;
+   struct amdgpu_ring *ring;
+   enum amd_sched_priority ctx_prio;
+
+   ctx->override_priority = priority;
+
+   ctx_prio = (ctx->override_priority == AMD_SCHED_PRIORITY_UNSET) ?
+   ctx->init_priority : ctx->override_priority;
+
+   for (i = 0; i < adev->num_rings; i++) {
+   ring = adev->rings[i];
+   entity = >rings[i].entity;
+   rq = >sched.sched_rq[ctx_prio];
+
+   if (ring->funcs->type == AMDGPU_RING_TYPE_KIQ)
+   continue;
+
+   amd_sched_entity_set_rq(entity, rq);
+   }
+}
+
 void amdgpu_ctx_mgr_init(struct amdgpu_ctx_mgr *mgr)
 {
mutex_init(>lock);
-- 
2.9.3



[PATCH 8/8] drm/amdgpu: add interface for editing a foreign process's priority v3

2017-06-26 Thread Andres Rodriguez
The AMDGPU_SCHED_OP_PROCESS_PRIORITY_OVERRIDE ioctl is used to set the
priority of a different process on the current system.

When a request is dropped, the process's contexts will be
restored to the priority specified at context creation time.

A request can be dropped by setting the override priority to
AMDGPU_CTX_PRIORITY_UNSET.

An fd is used to identify the remote process. This is simpler than
passing a pid number, which is vulnerable to re-use, etc.

This functionality is limited to DRM_MASTER since abuse of this
interface can have a negative impact on the system's performance.
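
A hypothetical userspace sketch of the flow; the union/field names below are
assumed from this description, not quoted from the uapi hunk, and
drmCommandWrite is the stock libdrm helper:

	#include <string.h>
	#include <xf86drm.h>
	#include <drm/amdgpu_drm.h>

	/* master_fd is the DRM master's fd, target_fd identifies the
	 * remote process, as described above. */
	union drm_amdgpu_sched args;	/* layout assumed: in.op, in.fd, in.priority */

	memset(&args, 0, sizeof(args));
	args.in.op = AMDGPU_SCHED_OP_PROCESS_PRIORITY_OVERRIDE;
	args.in.fd = target_fd;
	args.in.priority = AMDGPU_CTX_PRIORITY_HIGH_HW;
	drmCommandWrite(master_fd, DRM_AMDGPU_SCHED, &args, sizeof(args));

	/* Dropping the request restores the creation-time priorities. */
	args.in.priority = AMDGPU_CTX_PRIORITY_UNSET;
	drmCommandWrite(master_fd, DRM_AMDGPU_SCHED, &args, sizeof(args));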

v2: removed unused output structure
v3: change refcounted interface for a regular set operation

Signed-off-by: Andres Rodriguez 
---
 drivers/gpu/drm/amd/amdgpu/Makefile   |   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c   |  21 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c   |   2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_sched.c | 109 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_sched.h |  34 ++
 include/uapi/drm/amdgpu_drm.h |  17 +
 6 files changed, 164 insertions(+), 21 deletions(-)
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_sched.c
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_sched.h

diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile 
b/drivers/gpu/drm/amd/amdgpu/Makefile
index faea634..f039d88 100644
--- a/drivers/gpu/drm/amd/amdgpu/Makefile
+++ b/drivers/gpu/drm/amd/amdgpu/Makefile
@@ -25,7 +25,7 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \
amdgpu_prime.o amdgpu_vm.o amdgpu_ib.o amdgpu_pll.o \
amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o amdgpu_sync.o \
amdgpu_gtt_mgr.o amdgpu_vram_mgr.o amdgpu_virt.o amdgpu_atomfirmware.o \
-   amdgpu_queue_mgr.o
+   amdgpu_queue_mgr.o amdgpu_sched.o
 
 # add asic specific block
 amdgpu-$(CONFIG_DRM_AMDGPU_CIK)+= cik.o cik_ih.o kv_smc.o kv_dpm.o \
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
index bf05180..97aafc2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
@@ -25,6 +25,7 @@
 #include 
 #include 
 #include "amdgpu.h"
+#include "amdgpu_sched.h"
 
 static int amdgpu_ctx_priority_permit(struct drm_file *filp,
  enum amd_sched_priority priority)
@@ -220,26 +221,6 @@ static int amdgpu_ctx_query(struct amdgpu_device *adev,
return 0;
 }
 
-static enum amd_sched_priority amdgpu_to_sched_priority(int amdgpu_priority)
-{
-   switch (amdgpu_priority) {
-   case AMDGPU_CTX_PRIORITY_HIGH_HW:
-   return AMD_SCHED_PRIORITY_HIGH_HW;
-   case AMDGPU_CTX_PRIORITY_HIGH_SW:
-   return AMD_SCHED_PRIORITY_HIGH_SW;
-   case AMDGPU_CTX_PRIORITY_NORMAL:
-   return AMD_SCHED_PRIORITY_NORMAL;
-   case AMDGPU_CTX_PRIORITY_LOW_SW:
-   case AMDGPU_CTX_PRIORITY_LOW_HW:
-   return AMD_SCHED_PRIORITY_LOW;
-   case AMDGPU_CTX_PRIORITY_UNSET:
-   return AMD_SCHED_PRIORITY_UNSET;
-   default:
-   WARN(1, "Invalid context priority %d\n", amdgpu_priority);
-   return AMD_SCHED_PRIORITY_INVALID;
-   }
-}
-
 int amdgpu_ctx_ioctl(struct drm_device *dev, void *data,
 struct drm_file *filp)
 {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
index 12497a4..80483ab 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
@@ -28,6 +28,7 @@
 #include 
 #include "amdgpu.h"
 #include 
+#include "amdgpu_sched.h"
 #include "amdgpu_uvd.h"
 #include "amdgpu_vce.h"
 
@@ -1017,6 +1018,7 @@ const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
DRM_IOCTL_DEF_DRV(AMDGPU_GEM_CREATE, amdgpu_gem_create_ioctl, 
DRM_AUTH|DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(AMDGPU_CTX, amdgpu_ctx_ioctl, 
DRM_AUTH|DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(AMDGPU_VM, amdgpu_vm_ioctl, 
DRM_AUTH|DRM_RENDER_ALLOW),
+   DRM_IOCTL_DEF_DRV(AMDGPU_SCHED, amdgpu_sched_ioctl, DRM_MASTER),
DRM_IOCTL_DEF_DRV(AMDGPU_BO_LIST, amdgpu_bo_list_ioctl, 
DRM_AUTH|DRM_RENDER_ALLOW),
/* KMS */
DRM_IOCTL_DEF_DRV(AMDGPU_GEM_MMAP, amdgpu_gem_mmap_ioctl, 
DRM_AUTH|DRM_RENDER_ALLOW),
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sched.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_sched.c
new file mode 100644
index 000..cd12330
--- /dev/null
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sched.c
@@ -0,0 +1,109 @@
+/*
+ * Copyright 2017 Valve Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished 

[PATCH 1/8] drm/amdgpu: add parameter to allocate high priority contexts v11

2017-06-26 Thread Andres Rodriguez
Add a new context creation parameter to express a global context priority.

The priority ranking in descending order is as follows:
 * AMDGPU_CTX_PRIORITY_HIGH_HW
 * AMDGPU_CTX_PRIORITY_HIGH_SW
 * AMDGPU_CTX_PRIORITY_NORMAL
 * AMDGPU_CTX_PRIORITY_LOW_SW
 * AMDGPU_CTX_PRIORITY_LOW_HW

The driver will attempt to schedule work to the hardware according to
the priorities. No latency or throughput guarantees are provided by
this patch.

This interface is intended to service the EGL_IMG_context_priority
extension and Vulkan equivalents.

Setting a priority above NORMAL requires CAP_SYS_NICE or DRM_MASTER.
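
A hedged userspace sketch of allocating a high-priority context (the priority
field is assumed from the v2 note above about repurposing __pad; everything
else is standard libdrm usage):

	union drm_amdgpu_ctx args;

	memset(&args, 0, sizeof(args));
	args.in.op = AMDGPU_CTX_OP_ALLOC_CTX;
	args.in.priority = AMDGPU_CTX_PRIORITY_HIGH_HW;	/* needs CAP_SYS_NICE or DRM_MASTER */
	if (drmCommandWriteRead(fd, DRM_AMDGPU_CTX, &args, sizeof(args)) == 0)
		ctx_id = args.out.alloc.ctx_id;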

v2: Instead of using flags, repurpose __pad
v3: Swap enum values of _NORMAL _HIGH for backwards compatibility
v4: Validate usermode priority and store it
v5: Move priority validation into amdgpu_ctx_ioctl(), headline reword
v6: add UAPI note regarding priorities requiring CAP_SYS_ADMIN
v7: remove ctx->priority
v8: added AMDGPU_CTX_PRIORITY_LOW, s/CAP_SYS_ADMIN/CAP_SYS_NICE
v9: change the priority parameter to __s32
v10: split priorities into _SW and _HW
v11: Allow DRM_MASTER without CAP_SYS_NICE

Reviewed-by: Emil Velikov 
Reviewed-by: Christian König 
Signed-off-by: Andres Rodriguez 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c   | 61 +--
 drivers/gpu/drm/amd/scheduler/gpu_scheduler.h |  5 ++-
 include/uapi/drm/amdgpu_drm.h | 10 -
 3 files changed, 70 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
index a11e443..9ec85d5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
@@ -23,13 +23,40 @@
  */
 
 #include 
+#include 
 #include "amdgpu.h"
 
-static int amdgpu_ctx_init(struct amdgpu_device *adev, struct amdgpu_ctx *ctx)
+static int amdgpu_ctx_priority_permit(struct drm_file *filp,
+ enum amd_sched_priority priority)
+{
+   /* NORMAL and below are accessible by everyone */
+   if (priority <= AMD_SCHED_PRIORITY_NORMAL)
+   return 0;
+
+   if (capable(CAP_SYS_NICE))
+   return 0;
+
+   if (drm_is_current_master(filp))
+   return 0;
+
+   return -EACCES;
+}
+
+static int amdgpu_ctx_init(struct amdgpu_device *adev,
+  enum amd_sched_priority priority,
+  struct drm_file *filp,
+  struct amdgpu_ctx *ctx)
 {
unsigned i, j;
int r;
 
+   if (priority < 0 || priority >= AMD_SCHED_PRIORITY_MAX)
+   return -EINVAL;
+
+   r = amdgpu_ctx_priority_permit(filp, priority);
+   if (r)
+   return r;
+
memset(ctx, 0, sizeof(*ctx));
ctx->adev = adev;
kref_init(>refcount);
@@ -51,7 +78,7 @@ static int amdgpu_ctx_init(struct amdgpu_device *adev, struct 
amdgpu_ctx *ctx)
struct amdgpu_ring *ring = adev->rings[i];
struct amd_sched_rq *rq;
 
-   rq = >sched.sched_rq[AMD_SCHED_PRIORITY_NORMAL];
+   rq = >sched.sched_rq[priority];
 
if (ring == >gfx.kiq.ring)
continue;
@@ -100,6 +127,8 @@ static void amdgpu_ctx_fini(struct amdgpu_ctx *ctx)
 
 static int amdgpu_ctx_alloc(struct amdgpu_device *adev,
struct amdgpu_fpriv *fpriv,
+   struct drm_file *filp,
+   enum amd_sched_priority priority,
uint32_t *id)
 {
struct amdgpu_ctx_mgr *mgr = >ctx_mgr;
@@ -117,8 +146,9 @@ static int amdgpu_ctx_alloc(struct amdgpu_device *adev,
kfree(ctx);
return r;
}
+
*id = (uint32_t)r;
-   r = amdgpu_ctx_init(adev, ctx);
+   r = amdgpu_ctx_init(adev, priority, filp, ctx);
if (r) {
idr_remove(>ctx_handles, *id);
*id = 0;
@@ -188,11 +218,30 @@ static int amdgpu_ctx_query(struct amdgpu_device *adev,
return 0;
 }
 
+static enum amd_sched_priority amdgpu_to_sched_priority(int amdgpu_priority)
+{
+   switch (amdgpu_priority) {
+   case AMDGPU_CTX_PRIORITY_HIGH_HW:
+   return AMD_SCHED_PRIORITY_HIGH_HW;
+   case AMDGPU_CTX_PRIORITY_HIGH_SW:
+   return AMD_SCHED_PRIORITY_HIGH_SW;
+   case AMDGPU_CTX_PRIORITY_NORMAL:
+   return AMD_SCHED_PRIORITY_NORMAL;
+   case AMDGPU_CTX_PRIORITY_LOW_SW:
+   case AMDGPU_CTX_PRIORITY_LOW_HW:
+   return AMD_SCHED_PRIORITY_LOW;
+   default:
+   WARN(1, "Invalid context priority %d\n", amdgpu_priority);
+   return AMD_SCHED_PRIORITY_NORMAL;
+   }
+}
+
 int amdgpu_ctx_ioctl(struct drm_device *dev, void *data,
 struct drm_file *filp)
 {
int r;
uint32_t id;
+   enum amd_sched_priority priority;
 
union 

[PATCH 3/8] drm/amdgpu: implement ring set_priority for gfx_v8 compute v9

2017-06-26 Thread Andres Rodriguez
Programming CP_HQD_QUEUE_PRIORITY enables a queue to take priority over
other queues on the same pipe. Multiple queues on a pipe are timesliced
so this gives us full precedence over other queues.

Programming CP_HQD_PIPE_PRIORITY changes the SPI_ARB_PRIORITY of the
wave as follows:
0x2: CS_H
0x1: CS_M
0x0: CS_L

The SPI block will then dispatch work according to the policy set by
SPI_ARB_PRIORITY. In the current policy CS_H is higher priority than
gfx.

To avoid getting stuck in loops of resources bouncing between GFX and
high-priority compute, which would introduce further latency, we
statically reserve a portion of the pipe.
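
As a rough sketch of the idea only (the helper name and exact mapping are
illustrative; the real per-asic sequence is in the gfx_v8_0.c hunk below):

	static u32 spi_prio_from_sched_prio(enum amd_sched_priority prio)
	{
		/* 0x2 = CS_H, 0x1 = CS_M, 0x0 = CS_L, per the table above */
		return prio >= AMD_SCHED_PRIORITY_HIGH_HW ? 0x2 : 0x0;
	}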

v2: fix srbm_select to ring->queue and use ring->funcs->type
v3: use AMD_SCHED_PRIORITY_* instead of AMDGPU_CTX_PRIORITY_*
v4: switch int to enum amd_sched_priority
v5: corresponding changes for srbm_lock
v6: change CU reservation to PIPE_PERCENT allocation
v7: use kiq instead of MMIO
v8: back to MMIO, and make the implementation sleep safe.
v9: corresponding changes for splitting HIGH into _HW/_SW

Acked-by: Christian König 
Signed-off-by: Andres Rodriguez 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h|   4 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |   2 +
 drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c  | 105 +
 3 files changed, 111 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index b1e8cd9..7c6fca6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1114,6 +1114,10 @@ struct amdgpu_gfx {
boolin_suspend;
/* NGG */
struct amdgpu_ngg   ngg;
+
+   /* pipe reservation */
+   struct mutexpipe_reserve_mutex;
+   DECLARE_BITMAP  (pipe_reserve_bitmap, 
AMDGPU_MAX_COMPUTE_QUEUES);
 };
 
 int amdgpu_ib_get(struct amdgpu_device *adev, struct amdgpu_vm *vm,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index ae4387f..d7303d3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2019,6 +2019,7 @@ int amdgpu_device_init(struct amdgpu_device *adev,
adev->vm_manager.vm_pte_num_rings = 0;
adev->gart.gart_funcs = NULL;
adev->fence_context = dma_fence_context_alloc(AMDGPU_MAX_RINGS);
+   bitmap_zero(adev->gfx.pipe_reserve_bitmap, AMDGPU_MAX_COMPUTE_QUEUES);
 
adev->smc_rreg = _invalid_rreg;
adev->smc_wreg = _invalid_wreg;
@@ -2047,6 +2048,7 @@ int amdgpu_device_init(struct amdgpu_device *adev,
mutex_init(>pm.mutex);
mutex_init(>gfx.gpu_clock_mutex);
mutex_init(>srbm_mutex);
+   mutex_init(>gfx.pipe_reserve_mutex);
mutex_init(>grbm_idx_mutex);
mutex_init(>mn_lock);
hash_init(adev->mn_hash);
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
index 1429242..bee5ef4 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
@@ -6461,6 +6461,110 @@ static void gfx_v8_0_ring_set_wptr_compute(struct 
amdgpu_ring *ring)
WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr));
 }
 
+static void gfx_v8_0_ring_set_pipe_percent(struct amdgpu_ring *ring,
+  bool acquire)
+{
+   struct amdgpu_device *adev = ring->adev;
+   int pipe_num, tmp, reg;
+   int pipe_percent = acquire ? SPI_WCL_PIPE_PERCENT_GFX__VALUE_MASK : 0x1;
+
+   pipe_num = ring->me * adev->gfx.mec.num_pipe_per_mec + ring->pipe;
+
+   /* first me only has 2 entries, GFX and HP3D */
+   if (ring->me > 0)
+   pipe_num -= 2;
+
+   /* There is a bug in the GFX pipes that results in a HS
+* deadlock if the pipe is restricted to a percentage
+* lower than 17 */
+   if (ring->funcs->type != AMDGPU_RING_TYPE_COMPUTE && pipe_percent < 17)
+   pipe_percent = 17;
+
+   reg = mmSPI_WCL_PIPE_PERCENT_GFX + pipe_num;
+   tmp = RREG32(reg);
+   tmp = REG_SET_FIELD(tmp, SPI_WCL_PIPE_PERCENT_GFX, VALUE, pipe_percent);
+   WREG32(reg, tmp);
+}
+
+static void gfx_v8_0_pipe_reserve_resources(struct amdgpu_device *adev,
+   struct amdgpu_ring *ring,
+   bool acquire)
+{
+   int i, pipe;
+   bool reserve;
+   struct amdgpu_ring *iring;
+
+   mutex_lock(>gfx.pipe_reserve_mutex);
+   pipe = amdgpu_gfx_queue_to_bit(adev, ring->me, ring->pipe, 0);
+   if (acquire)
+   set_bit(pipe, adev->gfx.pipe_reserve_bitmap);
+   else
+   clear_bit(pipe, adev->gfx.pipe_reserve_bitmap);
+
+   if (!bitmap_weight(adev->gfx.pipe_reserve_bitmap, 
AMDGPU_MAX_COMPUTE_QUEUES)) {
+   /* Clear all reservations - everyone 

[PATCH] allow DRM_MASTER to change client's priorities v2

2017-06-26 Thread Andres Rodriguez
Updated with Christian's request to simplify the process priority
override interface.

Series available in the wip-process-priorities-v2 branch of:
git://people.freedesktop.org/~lostgoat/linux 



[PATCH 5/8] drm/amd/sched: allow clients to edit an entity's rq v2

2017-06-26 Thread Andres Rodriguez
This is useful for changing an entity's priority at runtime.

v2: don't modify the order of amd_sched_entity members

Signed-off-by: Andres Rodriguez 
---
 drivers/gpu/drm/amd/scheduler/gpu_scheduler.c | 26 +++---
 drivers/gpu/drm/amd/scheduler/gpu_scheduler.h |  3 +++
 2 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c 
b/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c
index 38cea6f..0166620 100644
--- a/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c
+++ b/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c
@@ -133,6 +133,7 @@ int amd_sched_entity_init(struct amd_gpu_scheduler *sched,
entity->rq = rq;
entity->sched = sched;
 
+   spin_lock_init(>rq_lock);
spin_lock_init(>queue_lock);
r = kfifo_alloc(>job_queue, jobs * sizeof(void *), GFP_KERNEL);
if (r)
@@ -204,8 +205,6 @@ static bool amd_sched_entity_is_ready(struct 
amd_sched_entity *entity)
 void amd_sched_entity_fini(struct amd_gpu_scheduler *sched,
   struct amd_sched_entity *entity)
 {
-   struct amd_sched_rq *rq = entity->rq;
-
if (!amd_sched_entity_is_initialized(sched, entity))
return;
 
@@ -215,7 +214,8 @@ void amd_sched_entity_fini(struct amd_gpu_scheduler *sched,
*/
wait_event(sched->job_scheduled, amd_sched_entity_is_idle(entity));
 
-   amd_sched_rq_remove_entity(rq, entity);
+   amd_sched_entity_set_rq(entity, NULL);
+
kfifo_free(>job_queue);
 }
 
@@ -236,6 +236,24 @@ static void amd_sched_entity_clear_dep(struct dma_fence 
*f, struct dma_fence_cb
dma_fence_put(f);
 }
 
+void amd_sched_entity_set_rq(struct amd_sched_entity *entity,
+struct amd_sched_rq *rq)
+{
+   if (entity->rq == rq)
+   return;
+
+   spin_lock(>rq_lock);
+
+   if (entity->rq)
+   amd_sched_rq_remove_entity(entity->rq, entity);
+
+   entity->rq = rq;
+   if (rq)
+   amd_sched_rq_add_entity(rq, entity);
+
+   spin_unlock(>rq_lock);
+}
+
 bool amd_sched_dependency_optimized(struct dma_fence* fence,
struct amd_sched_entity *entity)
 {
@@ -333,7 +351,9 @@ static bool amd_sched_entity_in(struct amd_sched_job 
*sched_job)
/* first job wakes up scheduler */
if (first) {
/* Add the entity to the run queue */
+   spin_lock(>rq_lock);
amd_sched_rq_add_entity(entity->rq, entity);
+   spin_unlock(>rq_lock);
amd_sched_wakeup(sched);
}
return added;
diff --git a/drivers/gpu/drm/amd/scheduler/gpu_scheduler.h 
b/drivers/gpu/drm/amd/scheduler/gpu_scheduler.h
index da040bc..4b528f7 100644
--- a/drivers/gpu/drm/amd/scheduler/gpu_scheduler.h
+++ b/drivers/gpu/drm/amd/scheduler/gpu_scheduler.h
@@ -39,6 +39,7 @@ struct amd_sched_rq;
 struct amd_sched_entity {
struct list_headlist;
struct amd_sched_rq *rq;
+   spinlock_t  rq_lock;
struct amd_gpu_scheduler*sched;
 
spinlock_t  queue_lock;
@@ -154,6 +155,8 @@ int amd_sched_entity_init(struct amd_gpu_scheduler *sched,
 void amd_sched_entity_fini(struct amd_gpu_scheduler *sched,
   struct amd_sched_entity *entity);
 void amd_sched_entity_push_job(struct amd_sched_job *sched_job);
+void amd_sched_entity_set_rq(struct amd_sched_entity *entity,
+struct amd_sched_rq *rq);
 
 int amd_sched_fence_slab_init(void);
 void amd_sched_fence_slab_fini(void);
-- 
2.9.3



[PATCH 6/8] drm/amdgpu: introduce AMDGPU_CTX_PRIORITY_UNSET

2017-06-26 Thread Andres Rodriguez
Use _INVALID to identify bad parameters and _UNSET to represent the
lack of interest in a specific value.

Signed-off-by: Andres Rodriguez 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c   | 2 ++
 drivers/gpu/drm/amd/scheduler/gpu_scheduler.h | 3 ++-
 include/uapi/drm/amdgpu_drm.h | 1 +
 3 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
index fbf15dd..e4de0fb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
@@ -230,6 +230,8 @@ static enum amd_sched_priority amdgpu_to_sched_priority(int 
amdgpu_priority)
case AMDGPU_CTX_PRIORITY_LOW_SW:
case AMDGPU_CTX_PRIORITY_LOW_HW:
return AMD_SCHED_PRIORITY_LOW;
+   case AMDGPU_CTX_PRIORITY_UNSET:
+   return AMD_SCHED_PRIORITY_UNSET;
default:
WARN(1, "Invalid context priority %d\n", amdgpu_priority);
return AMD_SCHED_PRIORITY_INVALID;
diff --git a/drivers/gpu/drm/amd/scheduler/gpu_scheduler.h 
b/drivers/gpu/drm/amd/scheduler/gpu_scheduler.h
index 4b528f7..52c8e54 100644
--- a/drivers/gpu/drm/amd/scheduler/gpu_scheduler.h
+++ b/drivers/gpu/drm/amd/scheduler/gpu_scheduler.h
@@ -122,7 +122,8 @@ enum amd_sched_priority {
AMD_SCHED_PRIORITY_HIGH_HW,
AMD_SCHED_PRIORITY_KERNEL,
AMD_SCHED_PRIORITY_MAX,
-   AMD_SCHED_PRIORITY_INVALID = -1
+   AMD_SCHED_PRIORITY_INVALID = -1,
+   AMD_SCHED_PRIORITY_UNSET = -2
 };
 
 /**
diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
index 6e2d92e..53aa9b5 100644
--- a/include/uapi/drm/amdgpu_drm.h
+++ b/include/uapi/drm/amdgpu_drm.h
@@ -163,6 +163,7 @@ union drm_amdgpu_bo_list {
 #define AMDGPU_CTX_UNKNOWN_RESET   3
 
 /* Context priority level */
+#define AMDGPU_CTX_PRIORITY_UNSET   -2048
 #define AMDGPU_CTX_PRIORITY_LOW_HW  -1023
 #define AMDGPU_CTX_PRIORITY_LOW_SW  -512
 #define AMDGPU_CTX_PRIORITY_NORMAL  0
-- 
2.9.3



[PATCH 2/8] drm/amdgpu: add framework for HW specific priority settings v9

2017-06-26 Thread Andres Rodriguez
Add an initial framework for changing the HW priorities of rings. The
framework allows requesting priority changes for the lifetime of an
amdgpu_job. After the job completes, the priority decays to the next-lowest
priority for which a request is still valid.

A new ring function, set_priority(), can now be populated to take care of
the HW-specific programming sequence for priority changes.

v2: set priority before emitting IB, and take a ref on amdgpu_job
v3: use AMD_SCHED_PRIORITY_* instead of AMDGPU_CTX_PRIORITY_*
v4: plug amdgpu_ring_restore_priority_cb into amdgpu_job_free_cb
v5: use atomic for tracking job priorities instead of last_job
v6: rename amdgpu_ring_priority_[get/put]() and align parameters
v7: replace spinlocks with mutexes for KIQ compatibility
v8: raise ring priority during cs_ioctl, instead of job_run
v9: priority_get() before push_job()

Signed-off-by: Andres Rodriguez 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c|  2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c   |  4 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c  | 76 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h  | 15 ++
 drivers/gpu/drm/amd/scheduler/gpu_scheduler.h |  7 +++
 5 files changed, 103 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index aeee684..2d2d59b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1102,6 +1102,8 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
job->uf_sequence = cs->out.handle;
amdgpu_job_free_resources(job);
amdgpu_cs_parser_fini(p, 0, true);
+   amdgpu_ring_priority_get(job->ring,
+amd_sched_get_job_priority(>base));
 
trace_amdgpu_cs_ioctl(job);
amd_sched_entity_push_job(>base);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index 3d641e1..63b0f3e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -101,6 +101,7 @@ static void amdgpu_job_free_cb(struct amd_sched_job *s_job)
 {
struct amdgpu_job *job = container_of(s_job, struct amdgpu_job, base);
 
+   amdgpu_ring_priority_put(job->ring, amd_sched_get_job_priority(s_job));
dma_fence_put(job->fence);
amdgpu_sync_free(>sync);
amdgpu_sync_free(>dep_sync);
@@ -137,6 +138,8 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct 
amdgpu_ring *ring,
job->fence_ctx = entity->fence_context;
*f = dma_fence_get(>base.s_fence->finished);
amdgpu_job_free_resources(job);
+   amdgpu_ring_priority_get(job->ring,
+amd_sched_get_job_priority(>base));
amd_sched_entity_push_job(>base);
 
return 0;
@@ -201,6 +204,7 @@ static struct dma_fence *amdgpu_job_run(struct 
amd_sched_job *sched_job)
/* if gpu reset, hw fence will be replaced here */
dma_fence_put(job->fence);
job->fence = dma_fence_get(fence);
+
amdgpu_job_free_resources(job);
return fence;
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
index 75165e0..2d8b20a7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
@@ -155,6 +155,75 @@ void amdgpu_ring_undo(struct amdgpu_ring *ring)
 }
 
 /**
+ * amdgpu_ring_priority_put - restore a ring's priority
+ *
+ * @ring: amdgpu_ring structure holding the information
+ * @priority: target priority
+ *
+ * Release a request for executing at @priority
+ */
+void amdgpu_ring_priority_put(struct amdgpu_ring *ring,
+ enum amd_sched_priority priority)
+{
+   int i;
+
+   if (!ring->funcs->set_priority)
+   return;
+
+   if (atomic_dec_return(>num_jobs[priority]) > 0)
+   return;
+
+   /* no need to restore if the job is already at the lowest priority */
+   if (priority == AMD_SCHED_PRIORITY_NORMAL)
+   return;
+
+   mutex_lock(>priority_mutex);
+   /* something higher prio is executing, no need to decay */
+   if (ring->priority > priority)
+   goto out_unlock;
+
+   /* decay priority to the next level with a job available */
+   for (i = priority; i >= AMD_SCHED_PRIORITY_MIN; i--) {
+   if (i == AMD_SCHED_PRIORITY_NORMAL
+   || atomic_read(>num_jobs[i])) {
+   ring->priority = i;
+   ring->funcs->set_priority(ring, i);
+   break;
+   }
+   }
+
+out_unlock:
+   mutex_unlock(>priority_mutex);
+}
+
+/**
+ * amdgpu_ring_priority_get - change the ring's priority
+ *
+ * @ring: amdgpu_ring structure holding the information
+ * @priority: target priority
+ *
+ * Request a ring's priority to be raised to @priority (refcounted).
+ 

Re: [PATCH 3/3] drm/amdgpu: add gtt_sys_limit

2017-06-26 Thread Felix Kuehling
I'm wondering what makes this possible. Let me quote the last discussion
we had about GART:

> On 17-04-04 06:26 PM, Felix Kuehling wrote:
>> Even with GART address space being allocated on demand, it still seems
>> to be limiting the maximum available system memory that can be allocated
>> from TTM. We have a test that allocates a bunch of 128MB buffers. On a
>> 32GB system memory system with a 4GB GPU it can only get 31 buffers, so
>> a bit under 4GB. Looks like BOs remain bound to GART after being
>> initialized or migrated to GTT. For KFD that limits the amount of usable
>> system memory, for amdgpu_cs, I think it limits the amount of system
>> memory that can be used in a single command submission.

> On 17-04-05 02:55 AM, Christian König wrote:
>>> Are these two effects intentional?
>> Yes, thought about dropping this as well but testing showed that it is
>> still necessary.
>>
>> The total amount of memory bound to the GPU must be limited by the
>> GART size or otherwise the swapping code won't work any more. E.g.
>> suspend/resume fails immediately if I remove that.
Has that changed? I don't remember seeing any changes to that effect.

Regards,
  Felix


On 17-06-26 09:39 AM, Christian König wrote:
> From: Christian König 
>
> Limit the size of the GART table for the system domain.
>
> This saves us a bunch of visible VRAM, but also limits the maximum BO size
> we can swap out.
>
> Signed-off-by: Christian König 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu.h | 2 ++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c  | 6 ++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 4 
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c| 8 ++--
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c | 6 --
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 6 +++---
>  7 files changed, 26 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index ab1dad2..a511029 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -76,6 +76,7 @@
>  extern int amdgpu_modeset;
>  extern int amdgpu_vram_limit;
>  extern int amdgpu_gart_size;
> +extern unsigned amdgpu_gart_sys_limit;
>  extern int amdgpu_moverate;
>  extern int amdgpu_benchmarking;
>  extern int amdgpu_testing;
> @@ -602,6 +603,7 @@ struct amdgpu_mc {
>   u64 mc_vram_size;
>   u64 visible_vram_size;
>   u64 gtt_size;
> + u64 gtt_sys_limit;
>   u64 gtt_start;
>   u64 gtt_end;
>   u64 vram_start;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 44484bb..b6edb83 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -1122,6 +1122,12 @@ static void amdgpu_check_arguments(struct 
> amdgpu_device *adev)
>   }
>   }
>  
> + if (amdgpu_gart_sys_limit < 32) {
> + dev_warn(adev->dev, "gart sys limit (%d) too small\n",
> +  amdgpu_gart_sys_limit);
> + amdgpu_gart_sys_limit = 32;
> + }
> +
>   amdgpu_check_vm_size(adev);
>  
>   amdgpu_check_block_size(adev);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index 5a1d794..907ae5e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -75,6 +75,7 @@
>  
>  int amdgpu_vram_limit = 0;
>  int amdgpu_gart_size = -1; /* auto */
> +unsigned amdgpu_gart_sys_limit = 256;
>  int amdgpu_moverate = -1; /* auto */
>  int amdgpu_benchmarking = 0;
>  int amdgpu_testing = 0;
> @@ -124,6 +125,9 @@ module_param_named(vramlimit, amdgpu_vram_limit, int, 
> 0600);
>  MODULE_PARM_DESC(gartsize, "Size of PCIE/IGP gart to setup in megabytes (32, 
> 64, etc., -1 = auto)");
>  module_param_named(gartsize, amdgpu_gart_size, int, 0600);
>  
> +MODULE_PARM_DESC(gartlimit, "GART limit for the system domain in megabytes 
> (default 256)");
> +module_param_named(gartlimit, amdgpu_gart_sys_limit, int, 0600);
> +
>  MODULE_PARM_DESC(moverate, "Maximum buffer migration rate in MB/s. (32, 64, 
> etc., -1=auto, 0=1=disabled)");
>  module_param_named(moverate, amdgpu_moverate, int, 0600);
>  
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> index 8877015..5c6a461 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> @@ -70,6 +70,9 @@ void amdgpu_gart_set_defaults(struct amdgpu_device *adev)
>   adev->mc.mc_vram_size);
>   else
>   adev->mc.gtt_size = (uint64_t)amdgpu_gart_size << 20;
> +
> + 

Re: [PATCH v2 00/14] improve the fb_setcmap helper

2017-06-26 Thread Dave Airlie
>
> I'm traveling and cannot make progress this week. The merge window is
> also real close so this series will therefore probably miss it unless
> something unexpected happens...

Don't worry, you've already missed the merge window for drm; we don't merge
things after -rc6. Please remember the Linus merge window is for maintainers
to merge stuff to Linus: things need to be ready long before it, and for drm
you shouldn't really tie yourself to the merge cycle.

Dave.


RE: [PATCH 3/3] drm/amdgpu: Add kernel parameter to control use of ECC/EDC.

2017-06-26 Thread Bridgman, John
Agreed... one person's "best" is another person's "OMG I didn't want that". IMO 
we should have bits correspond to specific options as much as possible, modulo 
HW capabilities.

>-Original Message-
>From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of
>Xie, AlexBin
>Sent: Monday, June 26, 2017 2:12 PM
>To: Panariti, David; Deucher, Alexander; amd-gfx@lists.freedesktop.org
>Subject: RE: [PATCH 3/3] drm/amdgpu: Add kernel parameter to control use
>of ECC/EDC.
>
>Hi,
>
>I have not checked the background of this discussion very closely yet, and
>you may already know the following.
>
>Customers may not want the meaning of the default setting to change; it is
>effectively an API.
>Example: an application and its environment are already set up and tested.
>Then the customer updates the driver and it suddenly behaves differently?
>Some serious applications definitely will not accept this.
>
>IMHO, it is better to avoid vague concepts like "best". It will become rather
>difficult to define what is best when there are multiple customers with
>different requirements. The driver's job is to provide a feature or
>mechanism; "best" sounds like a policy or a preference on the driver's side.
>
>In my past work, I have generally used a default for two cases:
>1. The default is the most conservative option, which must work. The
>application can then opt into advanced features via other parameter
>values/options.
>2. The default parameter preserves the compatible behavior from before this
>parameter/option was introduced.
>
>Regards,
>Alex Bin
>
>
>-Original Message-
>From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of
>Panariti, David
>Sent: Monday, June 26, 2017 12:06 PM
>To: Deucher, Alexander ; amd-
>g...@lists.freedesktop.org
>Subject: RE: [PATCH 3/3] drm/amdgpu: Add kernel parameter to control use
>of ECC/EDC.
>
>>> > I'd suggest setting amdgpu_ecc_flags to AMD_ECC_SUPPORT_BEST by
>>> > default.  That can be our asic specific default setting.  In the
>>> > case of CZ, that will be disabled until we decide to enable EDC by 
>>> > default.
>>> [davep] I'm confused.  ECC...BEST will cause EDC to be enabled.
>>> I used ECC as the generic term for ECC and EDC, since ECC seems more
>>> basic (EDC is built on top of ECC).
>>> If I understand you, we can't do what you want with the current setup.
>
>I'm saying we make ECC_BEST the default config (feel free to re-name it
>>ECC_DEFAULT).  Each asic can have a different default depending on what
>>features are ready.  So for CZ, we'd make ECC_BEST equivalent to
>>disabling ECC for now.  If a user wants to force it on, they can set ECC_EDC.
>Once EDC is stable on CZ, we can make ECC_BEST be equivalent to ECC_EDC.
>That way the default (ECC_BEST) always maps to the best available
>combination in that version of the driver.
>
>That's not how I meant it to work WRT BEST.
>Each asic will have a DEFAULT, but that isn't what BEST means.
>CZ is a good example (when fully implemented).  DEFAULT for CZ is everything
>except HALT, since, IMO, most people do not want to hang or reboot.
>BEST for CZ would be everything a person most interested in reliability would
>want, which IMO, includes HALT/reboot.
>Similarly, if something like performance degradation is really bad, DEFAULT
>would be OFF. BEST would be ON, e.g., if the user's app doesn't trigger the
>performance problem.
>The BEST bit is in a fixed position, so that customers don't need to worry what
>bits are needed for the most reliable performance (in our opinion) on a given
>asic.
>And if a customer (or developer) wants some arbitrary set of features, they
>can set bits as they want.
>
>I think DEFAULT will make most people happy.
>BEST allows people who are interested in everything they can get, regardless
>of any issues that brings with it. It is requested simply by using a fixed 
>param
>value (0x01) for any asic.
>This probably should not include features that have any kind of fatal flaw such
>as the Vega10 HBM ECC issue.  When fixed, it can be added to DEFAULT.
>And allowing per-feature control allows anyone to do precisely what they
>want.
>"Effort" increases as the number of interested users decreases.
>
>Using defines in the init code will be a problem if there is more than one kind
>of asic involved or a single asic that the user wants to use with different
>parameters.  However, this doesn't seem to be a high priority.
>If we do want to worry about it, then we'll need to put the values into the
>amdgpu_gfx struct.
>
>regards,
>davep
>
>> -Original Message-
>> From: Deucher, Alexander
>> Sent: Tuesday, June 06, 2017 6:16 PM
>> To: Panariti, David ; amd-
>> g...@lists.freedesktop.org
>> Subject: RE: [PATCH 3/3] drm/amdgpu: Add kernel parameter to control
>> use of ECC/EDC.
>>
>> > -Original Message-
>> > From: Panariti, David
>> > Sent: Tuesday, June 06, 2017 5:50 PM
>> > To: Deucher, Alexander; amd-gfx@lists.freedesktop.org
>> > 

Re: [PATCH v2] drm/core: Fail atomic IOCTL with no CRTC state but with signaling.

2017-06-26 Thread Harry Wentland
On 2017-06-20 01:57 PM, Andrey Grodzovsky wrote:
> Problem: While running the IGT kms_atomic_transition test suite I encountered
> a hang in drmHandleEvent immediately following an atomic_commit.
> After dumping the atomic state I realized that in this case there was
> not even one CRTC attached to the state, only disabled
> planes. This is probably due to a commit which hadn't changed any property
> that would require attaching CRTC state. This means drmHandleEvent
> will never wake up from read, since without a CRTC in the atomic state
> the event fd will not be signaled.
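
For reference, a minimal userspace sketch of the hang pattern described above
(illustrative only, not code from the test suite):

	drmEventContext evctx = { .version = DRM_EVENT_CONTEXT_VERSION };

	/* 'req' only touches properties of disabled planes, so no CRTC
	 * state is pulled into the commit. */
	drmModeAtomicCommit(fd, req, DRM_MODE_PAGE_FLIP_EVENT, NULL);
	drmHandleEvent(fd, &evctx);	/* blocks forever: no CRTC, no event */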
> 
> Fix: Protect against this issue by failing the atomic_commit early in
> drm_mode_atomic_commit, where such a problem can be identified.
> 
> v2:
> Fix typos and extra newlines.
> 
> Change-Id: I3ee28ffae35fd1e8bfe553146c44da53da02e6f8
> Signed-off-by: Andrey Grodzovsky 

Reviewed-by: Harry Wentland 

Harry

> ---
>  drivers/gpu/drm/drm_atomic.c | 11 ++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c
> index a567310..48145bf 100644
> --- a/drivers/gpu/drm/drm_atomic.c
> +++ b/drivers/gpu/drm/drm_atomic.c
> @@ -1933,7 +1933,7 @@ static int prepare_crtc_signaling(struct drm_device 
> *dev,
>  {
>   struct drm_crtc *crtc;
>   struct drm_crtc_state *crtc_state;
> - int i, ret;
> + int i, c = 0, ret;
>  
>   if (arg->flags & DRM_MODE_ATOMIC_TEST_ONLY)
>   return 0;
> @@ -1994,8 +1994,17 @@ static int prepare_crtc_signaling(struct drm_device 
> *dev,
>  
>   crtc_state->event->base.fence = fence;
>   }
> +
> + c++;
>   }
>  
> + /*
> +  * Having this flag means user mode pends on event which will never
> +  * reach due to lack of at least one CRTC for signaling
> +  */
> + if (c == 0 && (arg->flags & DRM_MODE_PAGE_FLIP_EVENT))
> + return -EINVAL;
> +
>   return 0;
>  }
>  
> 


RE: [PATCH 1/2] drm/amdgpu: vm_update_ptes remove code duplication

2017-06-26 Thread Kasiviswanathan, Harish
Sorry for the delay. I have rebased and pushed my changes.

Best Regards,
Harish


-Original Message-
From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of 
Christian König
Sent: Monday, June 26, 2017 5:55 AM
To: Kuehling, Felix ; amd-gfx@lists.freedesktop.org; 
Kasiviswanathan, Harish 
Subject: Re: [PATCH 1/2] drm/amdgpu: vm_update_ptes remove code duplication

Hi Harish,

have you committed that to amd-staging-4.11? If not, can you do so today?

I'm still getting compiler warnings from the VM code.

Regards,
Christian.

On 23.06.2017 at 19:54, Felix Kuehling wrote:
> Sorry for the delay.
>
> The series is Reviewed-by: Felix Kuehling .
>
> Regards,
>Felix
>
> On 17-06-13 02:24 PM, Christian König wrote:
>> On 13.06.2017 at 19:07, Alex Deucher wrote:
>>> On Fri, Jun 9, 2017 at 5:47 PM, Harish Kasiviswanathan 
>>>  wrote:
 Signed-off-by: Harish Kasiviswanathan 
 
>>> Series is:
>>> Acked-by: Alex Deucher 
>> Acked-by: Christian König .
>>
>>>
 ---
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 73
 --
1 file changed, 16 insertions(+), 57 deletions(-)

 diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
 b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
 index c4f1a30..c308047 100644
 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
 +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
 @@ -1264,59 +1264,6 @@ static struct amdgpu_bo 
 *amdgpu_vm_get_pt(struct amdgpu_pte_update_params *p,
}

/**
 - * amdgpu_vm_update_ptes_cpu - Update the page tables in the range
 - *  start - @end using CPU.
 - * See amdgpu_vm_update_ptes for parameter description.
 - *
 - */
 -static int amdgpu_vm_update_ptes_cpu(struct 
 amdgpu_pte_update_params *params,
 -uint64_t start, uint64_t end,
 -uint64_t dst, uint64_t flags)
 -{
 -   struct amdgpu_device *adev = params->adev;
 -   const uint64_t mask = AMDGPU_VM_PTE_COUNT(adev) - 1;
 -   void *pe_ptr;
 -   uint64_t addr;
 -   struct amdgpu_bo *pt;
 -   unsigned int nptes;
 -   int r;
 -
 -   /* initialize the variables */
 -   addr = start;
 -
 -   /* walk over the address space and update the page tables */
 -   while (addr < end) {
 -   pt = amdgpu_vm_get_pt(params, addr);
 -   if (!pt) {
 -   pr_err("PT not found, aborting update_ptes\n");
 -   return -EINVAL;
 -   }
 -
 -   WARN_ON(params->shadow);
 -
 -   r = amdgpu_bo_kmap(pt, _ptr);
 -   if (r)
 -   return r;
 -
 -   pe_ptr += (addr & mask) * 8;
 -
 -   if ((addr & ~mask) == (end & ~mask))
 -   nptes = end - addr;
 -   else
 -   nptes = AMDGPU_VM_PTE_COUNT(adev) - (addr &
 mask);
 -
 -   params->func(params, (uint64_t)pe_ptr, dst, nptes,
 -AMDGPU_GPU_PAGE_SIZE, flags);
 -
 -   amdgpu_bo_kunmap(pt);
 -   addr += nptes;
 -   dst += nptes * AMDGPU_GPU_PAGE_SIZE;
 -   }
 -
 -   return 0;
 -}
 -
 -/**
 * amdgpu_vm_update_ptes - make sure that page tables are valid
 *
 * @params: see amdgpu_pte_update_params definition @@ -1339,10 
 +1286,9 @@ static int amdgpu_vm_update_ptes(struct 
 amdgpu_pte_update_params *params,
   uint64_t addr, pe_start;
   struct amdgpu_bo *pt;
   unsigned nptes;
 +   int r;
 +   bool use_cpu_update = (params->func == 
 + amdgpu_vm_cpu_set_ptes);

 -   if (params->func == amdgpu_vm_cpu_set_ptes)
 -   return amdgpu_vm_update_ptes_cpu(params, start, end,
 -dst, flags);

   /* walk over the address space and update the page tables */
   for (addr = start; addr < end; addr += nptes) { @@ 
 -1353,6 +1299,10 @@ static int amdgpu_vm_update_ptes(struct 
 amdgpu_pte_update_params *params,
   }

   if (params->shadow) {
 +   if (WARN_ONCE(use_cpu_update,
 +   "CPU VM update doesn't suuport
 shadow pages"))
 +   return 0;
 +
   if (!pt->shadow)
   return 0;
 

RE: [PATCH 3/3] drm/amdgpu: Add kernel parameter to control use of ECC/EDC.

2017-06-26 Thread Xie, AlexBin
Hi,

I have not checked the background of this discussion very closely yet, and
you may already know the following.

Customers may not want the meaning of the default setting to change; it is
effectively an API.
Example: an application and its environment are already set up and tested.
Then the customer updates the driver and it suddenly behaves differently?
Some serious applications definitely will not accept this.

IMHO, it is better to avoid vague concepts like "best". It will become rather
difficult to define what is best when there are multiple customers with
different requirements. The driver's job is to provide a feature or
mechanism; "best" sounds like a policy or a preference on the driver's side.

In my past work, I have generally used a default for two cases:
1. The default is the most conservative option, which must work. The
application can then opt into advanced features via other parameter
values/options.
2. The default parameter preserves the compatible behavior from before this
parameter/option was introduced.

Regards,
Alex Bin


-Original Message-
From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of 
Panariti, David
Sent: Monday, June 26, 2017 12:06 PM
To: Deucher, Alexander ; 
amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH 3/3] drm/amdgpu: Add kernel parameter to control use of 
ECC/EDC.

>> > I'd suggest setting amdgpu_ecc_flags to AMD_ECC_SUPPORT_BEST by 
>> > default.  That can be our asic specific default setting.  In the 
>> > case of CZ, that will be disabled until we decide to enable EDC by default.
>> [davep] I'm confused.  ECC...BEST will cause EDC to be enabled.
>> I used ECC as the generic term for ECC and EDC, since ECC seems more 
>> basic (EDC is built on top of ECC).
>> If I understand you, we can't do what you want with the current setup.

>I'm saying we make ECC_BEST the default config (feel free to re-name it
>ECC_DEFAULT).  Each asic can have a different default depending 
>on what features are ready.  So for CZ, we'd make ECC_BEST equivalent to 
>disabling ECC for now.  If a user wants to force it on, they can 
>set ECC_EDC.  Once EDC is stable on CZ, we can make ECC_BEST be equivalent to 
>ECC_EDC.  That way the default (ECC_BEST) always maps 
>to the best available combination in that version of the driver.

That's not how I meant it to work WRT BEST.
Each asic will have a DEFAULT, but that isn't what BEST means.
CZ is a good example (when fully implemented).  DEFAULT for CZ is everything 
except HALT, since, IMO, most people do not want to hang or reboot.
BEST for CZ would be everything a person most interested in reliability would 
want, which IMO, includes HALT/reboot.
Similarly, if something like performance degradation is really bad, DEFAULT 
would be OFF. BEST would be ON, e.g., if the user's app doesn't trigger the 
performance problem.
The BEST bit is in a fixed position, so that customers don't need to worry what 
bits are needed for the most reliable performance (in our opinion) on a given 
asic.
And if a customer (or developer) wants some arbitrary set of features, they can 
set bits as they want.

I think DEFAULT will make most people happy.
BEST allows people who are interested in everything they can get, regardless of 
any issues that brings with it. It is requested simply by using a fixed param 
value (0x01) for any asic.
This probably should not include features that have any kind of fatal flaw such 
as the Vega10 HBM ECC issue.  When fixed, it can be added to DEFAULT.
And allowing per-feature control allows anyone to do precisely what they want.
"Effort" increases as the number of interested users decreases.

Using defines in the init code will be a problem if there is more than one kind 
of asic involved or a single asic that the user wants to use with different 
parameters.  However, this doesn't seem to be a high priority.
If we do want to worry about it, then we'll need to put the values into the 
amdgpu_gfx struct.

regards,
davep

> -Original Message-
> From: Deucher, Alexander
> Sent: Tuesday, June 06, 2017 6:16 PM
> To: Panariti, David ; amd-
> g...@lists.freedesktop.org
> Subject: RE: [PATCH 3/3] drm/amdgpu: Add kernel parameter to control use
> of ECC/EDC.
> 
> > -Original Message-
> > From: Panariti, David
> > Sent: Tuesday, June 06, 2017 5:50 PM
> > To: Deucher, Alexander; amd-gfx@lists.freedesktop.org
> > Subject: RE: [PATCH 3/3] drm/amdgpu: Add kernel parameter to control
> > use of ECC/EDC.
> >
> > Rather than inlining this in a number of places, Re verbosity:
> > I've worked in embedded environments and when dealing with
> > intermittent problems it's nice to have all of the information ASAP
> > rather than waiting for a problem to reoccur, especially if it's very
> intermittent.
> > I would've preferred more.
> > Since it only shows up happens on CZ, it adds little to the output.
> > I like to show the reasons why EDC didn't happen, hence the backwards
> > 

RE: [PATCH 3/3] drm/amdgpu: Add kernel parameter to control use of ECC/EDC.

2017-06-26 Thread Panariti, David
>> > I'd suggest setting amdgpu_ecc_flags to AMD_ECC_SUPPORT_BEST by 
>> > default.  That can be our asic specific default setting.  In the 
>> > case of CZ, that will be disabled until we decide to enable EDC by default.
>> [davep] I'm confused.  ECC...BEST will cause EDC to be enabled.
>> I used ECC as the generic term for ECC and EDC, since ECC seems more 
>> basic (EDC is built on top of ECC).
>> If I understand you, we can't do what you want with the current setup.

>I'm saying we make ECC_BEST the default config (feel free to re-name it
>ECC_DEFAULT).  Each asic can have a different default depending 
>on what features are ready.  So for CZ, we'd make ECC_BEST equivalent to 
>disabling ECC for now.  If a user wants to force it on, they can 
>set ECC_EDC.  Once EDC is stable on CZ, we can make ECC_BEST be equivalent to 
>ECC_EDC.  That way the default (ECC_BEST) always maps 
>to the best available combination in that version of the driver.

That's not how I meant it to work WRT BEST.
Each asic will have a DEFAULT, but that isn't what BEST means.
CZ is a good example (when fully implemented).  DEFAULT for CZ is everything 
except HALT, since, IMO, most people do not want to hang or reboot.
BEST for CZ would be everything a person most interested in reliability would 
want, which IMO, includes HALT/reboot.
Similarly, if something like performance degradation is really bad, DEFAULT 
would be OFF. BEST would be ON, e.g., if the user's app doesn't trigger the 
performance problem.
The BEST bit is in a fixed position, so that customers don't need to worry what 
bits are needed for the most reliable performance (in our opinion) on a given 
asic.
And if a customer (or developer) wants some arbitrary set of features, they can 
set bits as they want.

I think DEFAULT will make most people happy.
BEST allows people who are interested in everything they can get, regardless of 
any issues that brings with it. It is requested simply by using a fixed param 
value (0x01) for any asic.
This probably should not include features that have any kind of fatal flaw such 
as the Vega10 HBM ECC issue.  When fixed, it can be added to DEFAULT.
And allowing per-feature control allows anyone to do precisely what they want.
"Effort" increases as the number of interested users decreases.

Using defines in the init code will be a problem if there is more than one kind 
of asic involved or a single asic that the user wants to use with different 
parameters.  However, this doesn't seem to be a high priority.
If we do want to worry about it, then we'll need to put the values into the 
amdgpu_gfx struct.
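
A hedged sketch of the bit layout being discussed; only the fixed 0x01
position of BEST comes from this thread, the remaining names and values are
hypothetical:

	#define AMD_ECC_SUPPORT_BEST	0x1	/* fixed bit: best reliable combination for this asic */
	#define AMD_ECC_SUPPORT_ECC	0x2	/* raw ECC */
	#define AMD_ECC_SUPPORT_EDC	0x4	/* error detection and correction, built on ECC */
	#define AMD_ECC_SUPPORT_HALT	0x8	/* halt/reboot on an uncorrectable error */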

regards,
davep

> -Original Message-
> From: Deucher, Alexander
> Sent: Tuesday, June 06, 2017 6:16 PM
> To: Panariti, David ; amd-
> g...@lists.freedesktop.org
> Subject: RE: [PATCH 3/3] drm/amdgpu: Add kernel parameter to control use
> of ECC/EDC.
> 
> > -Original Message-
> > From: Panariti, David
> > Sent: Tuesday, June 06, 2017 5:50 PM
> > To: Deucher, Alexander; amd-gfx@lists.freedesktop.org
> > Subject: RE: [PATCH 3/3] drm/amdgpu: Add kernel parameter to control
> > use of ECC/EDC.
> >
> > Rather than inlining this in a number of places, Re verbosity:
> > I've worked in embedded environments and when dealing with
> > intermittent problems it's nice to have all of the information ASAP
> > rather than waiting for a problem to reoccur, especially if it's very
> intermittent.
> > I would've preferred more.
> > Since it only shows up happens on CZ, it adds little to the output.
> > I like to show the reasons why EDC didn't happen, hence the backwards
> > looking messages.
> > In this particular case,  without the "... not requested..." we can't
> > tell if it was the flags or the ring being unready that made us bail 
> > earlier.
> 
> I'm fine with a message about EDC either being enabled or disabled, but a
> bunch of random debug statements along the way are too much.  They tend
> to just cause confusion and clutter up the logs.
> 
> >
> > > -Original Message-
> > > From: Deucher, Alexander
> > > Sent: Tuesday, June 06, 2017 5:22 PM
> > > To: Panariti, David ; amd-
> > > g...@lists.freedesktop.org
> > > Cc: Panariti, David 
> > > Subject: RE: [PATCH 3/3] drm/amdgpu: Add kernel parameter to control
> > use
> > > of ECC/EDC.
> > >
> > > > -Original Message-
> > > > From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On
> > Behalf
> > > > Of David Panariti
> > > > Sent: Tuesday, June 06, 2017 4:33 PM
> > > > To: amd-gfx@lists.freedesktop.org
> > > > Cc: Panariti, David
> > > > Subject: [PATCH 3/3] drm/amdgpu: Add kernel parameter to control
> > > > use of ECC/EDC.
> > > >
> > > > Allow various kinds of memory integrity methods (e.g. ECC/EDC) to
> > > > be enabled or disabled.  By default, all features are disabled.
> > > >
> > > > EDC is Error Detection and Correction.  It can detect ECC errors
> > > > and do 0 or 

Re: [PATCH 5/8] drm/amdgpu: move priority decay logic into amd_sched_priority_ctr

2017-06-26 Thread Andres Rodriguez



On 2017-06-26 09:32 AM, Christian König wrote:
Sorry for the delay, back from a rather long sick leave today and trying 
to catch up with my work.




Don't worry too much. Get well soon!

Quick ping on the query above. The query can be summarized as: which 
ioctl interface do we prefer for the override?


  * AMDGPU_SCHED_OP_PROCESS_PRIORITY_GET/PUT - refcounted override

or

  * AMDGPU_SCHED_OP_PROCESS_PRIORITY_SET - simple set


Please keep it simple and stupid. No extra fiddling here, that stuff is 
complicated enough.


So I would strongly suggest to have a simple set interface which the DRM 
master can use to set the priority.


If we have multiple DRM masters which feel responsible for this, they 
should sync up in userspace on what to do.


Sounds good to me. I'll send out a _SET version of this series.

Regards,
Andres



Regards,
Christian.



RE: [PATCH 3/3] drm/amdgpu: Add kernel parameter to control use of ECC/EDC.

2017-06-26 Thread Deucher, Alexander
> -Original Message-
> From: Panariti, David
> Sent: Monday, June 26, 2017 12:06 PM
> To: Deucher, Alexander; amd-gfx@lists.freedesktop.org
> Subject: RE: [PATCH 3/3] drm/amdgpu: Add kernel parameter to control use
> of ECC/EDC.
> 
> >> > I'd suggest setting amdgpu_ecc_flags to AMD_ECC_SUPPORT_BEST by
> >> > default.  That can be our asic specific default setting.  In the
> >> > case of CZ, that will be disabled until we decide to enable EDC by
> default.
> >> [davep] I'm confused.  ECC...BEST will cause EDC to be enabled.
> >> I used ECC as the generic term for ECC and EDC, since ECC seems more
> >> basic (EDC is built on top of ECC).
> >> If I understand you, we can't do what you want with the current setup.
> 
> >I'm saying we make ECC_BEST the default config (feel free to re-name it
> ECC_DEFAULT).  Each asic can have a different default depending
> >on what features are ready.  So for CZ, we'd make ECC_BEST equivalent to
> disabling ECC for now.  If a user wants to force it on, they can
> >set ECC_EDC.  Once EDC is stable on CZ, we can make ECC_BEST be
> equivalent to ECC_EDC.  That way the default (ECC_BEST) always maps
> >to the best available combination in that version of the driver.
> 
> That's not how I meant it to work WRT BEST.
> Each asic will have a DEFAULT, but that isn't what BEST means.
> CZ is a good example (when fully implemented).  DEFAULT for CZ is
> everything except HALT, since, in my opinion, most people do not want to
> hang or reboot.
> BEST for CZ would be everything a person most interested in reliability would
> want, which IMO, includes HALT/reboot.
> Similarly, if something like performance degradation is really bad, DEFAULT
> would be OFF. BEST would be ON, e.g., if the user's app doesn't trigger the
> performance problem.
> The BEST bit is in a fixed position, so that customers don't need to worry
> what bits are needed for the most reliable performance (in our opinion) on a
> given asic.
> And if a customer (or developer) wants some arbitrary set of features, they
> can set bits as they want.
> 
> I think DEFAULT will make most people happy.
> BEST is for people who want everything they can get, regardless of any
> issues that brings with it. It is requested simply by using a fixed param
> value (0x01) for any asic.
> This probably should not include features that have any kind of fatal flaw
> such as the Vega10 HBM ECC issue.  When fixed, it can be added to DEFAULT.
> And allowing per-feature control allows anyone to do precisely what they
> want.
> "Effort" increases as the number of interested users decreases.
> 

I would like to be able to have different default settings on all asics without 
requiring the user to explicitly set the option for each chip.  If there is a best 
setting it should be the default.  BEST implies it's the best option so users 
will try it whether it makes sense or not.  If you want to enable something 
that is not the default, then you need to explicitly ask for it.  BEST could be 
defined as 0x since each asic would only look at the masked bits for 
the features it supports.  I would prefer not to call it BEST though.  Maybe 
ALL.

RAS_NONE 0
RAS_DEFAULT (1 << 0)
RAS_VRAM_ECC (1 << 1)
RAS_SRAM_EDC (1 << 2)
...
RAS_ALL 0x
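
To make the masking semantics concrete, here is a rough sketch of how such a 
parameter could be resolved per asic (the helper and its arguments are 
illustrative only, not actual amdgpu code):

static u32 ras_resolve_flags(u32 param, u32 asic_default, u32 asic_supported)
{
	u32 flags = param;

	/* RAS_DEFAULT expands to the per-asic curated feature set. */
	if (flags & RAS_DEFAULT)
		flags = asic_default;

	/* Each asic only honours the bits it actually supports. */
	return flags & asic_supported;
}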

Alex

> Using defines in the init code will be a problem if there is more than one 
> kind
> of asic involved or a single asic that the user wants to use with different
> parameters.  However, this doesn't seem to be a high priority.
> If we do want to worry about it, then we'll need to put the values into the
> amdgpu_gfx struct.
> 
> regards,
> davep
> 
> > -Original Message-
> > From: Deucher, Alexander
> > Sent: Tuesday, June 06, 2017 6:16 PM
> > To: Panariti, David ; amd-
> > g...@lists.freedesktop.org
> > Subject: RE: [PATCH 3/3] drm/amdgpu: Add kernel parameter to control
> use
> > of ECC/EDC.
> >
> > > -Original Message-
> > > From: Panariti, David
> > > Sent: Tuesday, June 06, 2017 5:50 PM
> > > To: Deucher, Alexander; amd-gfx@lists.freedesktop.org
> > > Subject: RE: [PATCH 3/3] drm/amdgpu: Add kernel parameter to control
> > > use of ECC/EDC.
> > >
> > > Rather than inlining this in a number of places, Re verbosity:
> > > I've worked in embedded environments and when dealing with
> > > intermittent problems it's nice to have all of the information ASAP
> > > rather than waiting for a problem to reoccur, especially if it's very
> > intermittent.
> > > I would've preferred more.
> > > Since it only happens on CZ, it adds little to the output.
> > > I like to show the reasons why EDC didn't happen, hence the backwards
> > > looking messages.
> > > In this particular case,  without the "... not requested..." we can't
> > > tell if it was the flags or the ring being unready that made us bail 
> > > earlier.
> >
> > I'm fine with a message about EDC either being enabled or disabled, but a
> > bunch of random debug statements along the way are too much.

RE: [PATCH 2/3] drm/amdgpu: fix amdgpu_debugfs_gem_bo_info

2017-06-26 Thread Deucher, Alexander
> -Original Message-
> From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf
> Of Christian König
> Sent: Monday, June 26, 2017 9:40 AM
> To: amd-gfx@lists.freedesktop.org
> Cc: j...@fastquake.com; Kuehling, Felix
> Subject: [PATCH 2/3] drm/amdgpu: fix amdgpu_debugfs_gem_bo_info
> 
> From: Christian König 
> 
> Otherwise we trigger a bunch of WARN_ONs when this is called.
> 
> Signed-off-by: Christian König 

Reviewed-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 10 +++---
>  1 file changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> index 621f739..96c4493 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> @@ -784,6 +784,7 @@ static int amdgpu_debugfs_gem_bo_info(int id, void
> *ptr, void *data)
>   unsigned domain;
>   const char *placement;
>   unsigned pin_count;
> + uint64_t offset;
> 
>   domain = amdgpu_mem_type_to_domain(bo-
> >tbo.mem.mem_type);
>   switch (domain) {
> @@ -798,9 +799,12 @@ static int amdgpu_debugfs_gem_bo_info(int id, void
> *ptr, void *data)
>   placement = " CPU";
>   break;
>   }
> - seq_printf(m, "\t0x%08x: %12ld byte %s @ 0x%010Lx",
> -id, amdgpu_bo_size(bo), placement,
> -amdgpu_bo_gpu_offset(bo));
> + seq_printf(m, "\t0x%08x: %12ld byte %s",
> +id, amdgpu_bo_size(bo), placement);
> +
> + offset = ACCESS_ONCE(bo->tbo.mem.start);
> + if (offset != AMDGPU_BO_INVALID_OFFSET)
> + seq_printf(m, " @ 0x%010Lx", offset);
> 
>   pin_count = ACCESS_ONCE(bo->pin_count);
>   if (pin_count)
> --
> 2.7.4
> 


RE: [PATCH 1/3] drm/amdgpu: cleanup initializing gtt_size

2017-06-26 Thread Deucher, Alexander
> -Original Message-
> From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf
> Of Christian König
> Sent: Monday, June 26, 2017 9:40 AM
> To: amd-gfx@lists.freedesktop.org
> Cc: j...@fastquake.com; Kuehling, Felix
> Subject: [PATCH 1/3] drm/amdgpu: cleanup initializing gtt_size
> 
> From: Christian König 
> 
> Stop spreading the code over all GMC generations.
> 
> Signed-off-by: Christian König 

Reviewed-by: Alex Deucher 


> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu.h  |  1 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c | 20
> 
>  drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c| 10 +-
>  drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c| 10 +-
>  drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c| 10 +-
>  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c| 10 +-
>  6 files changed, 25 insertions(+), 36 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index 7624294..ab1dad2 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -561,6 +561,7 @@ struct amdgpu_gart {
>   const struct amdgpu_gart_funcs *gart_funcs;
>  };
> 
> +void amdgpu_gart_set_defaults(struct amdgpu_device *adev);
>  int amdgpu_gart_table_ram_alloc(struct amdgpu_device *adev);
>  void amdgpu_gart_table_ram_free(struct amdgpu_device *adev);
>  int amdgpu_gart_table_vram_alloc(struct amdgpu_device *adev);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> index ccef3cf..8877015 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> @@ -52,6 +52,26 @@
>  /*
>   * Common GART table functions.
>   */
> +
> +/**
> + * amdgpu_gart_set_defaults - set the default gtt_size
> + *
> + * @adev: amdgpu_device pointer
> + *
> + * Set the default gtt_size based on parameters and available VRAM.
> + */
> +void amdgpu_gart_set_defaults(struct amdgpu_device *adev)
> +{
> + /* unless the user had overridden it, set the gart
> +  * size equal to the 1024 or vram, whichever is larger.
> +  */
> + if (amdgpu_gart_size == -1)
> + adev->mc.gtt_size =
> max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
> + adev->mc.mc_vram_size);
> + else
> + adev->mc.gtt_size = (uint64_t)amdgpu_gart_size << 20;
> +}
> +
>  /**
>   * amdgpu_gart_table_ram_alloc - allocate system ram for gart page table
>   *
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
> b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
> index 35655a1..5cc3f39 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
> @@ -321,15 +321,7 @@ static int gmc_v6_0_mc_init(struct amdgpu_device
> *adev)
>   adev->mc.real_vram_size = RREG32(mmCONFIG_MEMSIZE) *
> 1024ULL * 1024ULL;
>   adev->mc.visible_vram_size = adev->mc.aper_size;
> 
> - /* unless the user had overridden it, set the gart
> -  * size equal to the 1024 or vram, whichever is larger.
> -  */
> - if (amdgpu_gart_size == -1)
> - adev->mc.gtt_size =
> max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
> - adev->mc.mc_vram_size);
> - else
> - adev->mc.gtt_size = (uint64_t)amdgpu_gart_size << 20;
> -
> + amdgpu_gart_set_defaults(adev);
>   gmc_v6_0_vram_gtt_location(adev, >mc);
> 
>   return 0;
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
> b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
> index 83c21a1..15f2c0f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
> @@ -373,15 +373,7 @@ static int gmc_v7_0_mc_init(struct amdgpu_device
> *adev)
>   if (adev->mc.visible_vram_size > adev->mc.real_vram_size)
>   adev->mc.visible_vram_size = adev->mc.real_vram_size;
> 
> - /* unless the user had overridden it, set the gart
> -  * size equal to the 1024 or vram, whichever is larger.
> -  */
> - if (amdgpu_gart_size == -1)
> - adev->mc.gtt_size =
> max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
> - adev->mc.mc_vram_size);
> - else
> - adev->mc.gtt_size = (uint64_t)amdgpu_gart_size << 20;
> -
> + amdgpu_gart_set_defaults(adev);
>   gmc_v7_0_vram_gtt_location(adev, >mc);
> 
>   return 0;
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
> b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
> index c92af24..213af65 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
> @@ -535,15 +535,7 @@ static int gmc_v8_0_mc_init(struct amdgpu_device
> *adev)
>   if (adev->mc.visible_vram_size > adev->mc.real_vram_size)
>   adev->mc.visible_vram_size = adev->mc.real_vram_size;
> 
> - /* unless the user had overridden it, set the gart
> -  * size 

RE: [PATCH] drm/amdgpu: fix vulkan test performance drop and hang on VI

2017-06-26 Thread Deucher, Alexander
> -Original Message-
> From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf
> Of Rex Zhu
> Sent: Monday, June 26, 2017 3:26 AM
> To: amd-gfx@lists.freedesktop.org
> Cc: Zhu, Rex
> Subject: [PATCH] drm/amdgpu: fix vulkan test performance drop and hang
> on VI
> 
> caused by not programming dynamic_cu_mask_addr in the KIQ MQD.
> 
> v2: create struct vi_mqd_allocation in FB which will contain
> 1. PM4 MQD structure.
> 2. Write Pointer Poll Memory.
> 3. Read Pointer Report Memory
> 4. Dynamic CU Mask.
> 5. Dynamic RB Mask.
> 
> Change-Id: I22c840f1bf8d365f7df33a27d6b11e1aea8f2958
> Signed-off-by: Rex Zhu 

Reviewed-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c|  27 ++--
>  drivers/gpu/drm/amd/include/vi_structs.h | 268
> +++
>  2 files changed, 285 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
> b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
> index 1a75ab1..452cc5b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
> @@ -40,7 +40,6 @@
> 
>  #include "bif/bif_5_0_d.h"
>  #include "bif/bif_5_0_sh_mask.h"
> -
>  #include "gca/gfx_8_0_d.h"
>  #include "gca/gfx_8_0_enum.h"
>  #include "gca/gfx_8_0_sh_mask.h"
> @@ -2100,7 +2099,7 @@ static int gfx_v8_0_sw_init(void *handle)
>   return r;
> 
>   /* create MQD for all compute queues as well as KIQ for SRIOV case
> */
> - r = amdgpu_gfx_compute_mqd_sw_init(adev, sizeof(struct
> vi_mqd));
> + r = amdgpu_gfx_compute_mqd_sw_init(adev, sizeof(struct
> vi_mqd_allocation));
>   if (r)
>   return r;
> 
> @@ -4715,9 +4714,6 @@ static int gfx_v8_0_mqd_init(struct amdgpu_ring
> *ring)
>   uint64_t hqd_gpu_addr, wb_gpu_addr, eop_base_addr;
>   uint32_t tmp;
> 
> - /* init the mqd struct */
> - memset(mqd, 0, sizeof(struct vi_mqd));
> -
>   mqd->header = 0xC0310800;
>   mqd->compute_pipelinestat_enable = 0x0001;
>   mqd->compute_static_thread_mgmt_se0 = 0x;
> @@ -4725,7 +4721,12 @@ static int gfx_v8_0_mqd_init(struct amdgpu_ring
> *ring)
>   mqd->compute_static_thread_mgmt_se2 = 0x;
>   mqd->compute_static_thread_mgmt_se3 = 0x;
>   mqd->compute_misc_reserved = 0x0003;
> -
> + if (!(adev->flags & AMD_IS_APU)) {
> + mqd->dynamic_cu_mask_addr_lo = lower_32_bits(ring-
> >mqd_gpu_addr
> +  + offsetof(struct
> vi_mqd_allocation, dyamic_cu_mask));
> + mqd->dynamic_cu_mask_addr_hi = upper_32_bits(ring-
> >mqd_gpu_addr
> +  + offsetof(struct
> vi_mqd_allocation, dyamic_cu_mask));
> + }
>   eop_base_addr = ring->eop_gpu_addr >> 8;
>   mqd->cp_hqd_eop_base_addr_lo = eop_base_addr;
>   mqd->cp_hqd_eop_base_addr_hi =
> upper_32_bits(eop_base_addr);
> @@ -4900,7 +4901,7 @@ static int gfx_v8_0_kiq_init_queue(struct
> amdgpu_ring *ring)
>   if (adev->gfx.in_reset) { /* for GPU_RESET case */
>   /* reset MQD to a clean status */
>   if (adev->gfx.mec.mqd_backup[mqd_idx])
> - memcpy(mqd, adev-
> >gfx.mec.mqd_backup[mqd_idx], sizeof(*mqd));
> + memcpy(mqd, adev-
> >gfx.mec.mqd_backup[mqd_idx], sizeof(struct vi_mqd_allocation));
> 
>   /* reset ring buffer */
>   ring->wptr = 0;
> @@ -4916,6 +4917,9 @@ static int gfx_v8_0_kiq_init_queue(struct
> amdgpu_ring *ring)
>   vi_srbm_select(adev, 0, 0, 0, 0);
>   mutex_unlock(>srbm_mutex);
>   } else {
> + memset((void *)mqd, 0, sizeof(struct vi_mqd_allocation));
> + ((struct vi_mqd_allocation *)mqd)->dyamic_cu_mask =
> 0x;
> + ((struct vi_mqd_allocation *)mqd)->dyamic_rb_mask =
> 0x;
>   mutex_lock(>srbm_mutex);
>   vi_srbm_select(adev, ring->me, ring->pipe, ring->queue, 0);
>   gfx_v8_0_mqd_init(ring);
> @@ -4929,7 +4933,7 @@ static int gfx_v8_0_kiq_init_queue(struct
> amdgpu_ring *ring)
>   mutex_unlock(>srbm_mutex);
> 
>   if (adev->gfx.mec.mqd_backup[mqd_idx])
> - memcpy(adev->gfx.mec.mqd_backup[mqd_idx],
> mqd, sizeof(*mqd));
> + memcpy(adev->gfx.mec.mqd_backup[mqd_idx],
> mqd, sizeof(struct vi_mqd_allocation));
>   }
> 
>   return r;
> @@ -4947,6 +4951,9 @@ static int gfx_v8_0_kcq_init_queue(struct
> amdgpu_ring *ring)
>   int mqd_idx = ring - >gfx.compute_ring[0];
> 
>   if (!adev->gfx.in_reset && !adev->gfx.in_suspend) {
> + memset((void *)mqd, 0, sizeof(struct vi_mqd_allocation));
> + ((struct vi_mqd_allocation *)mqd)->dyamic_cu_mask =
> 0x;
> + ((struct vi_mqd_allocation *)mqd)->dyamic_rb_mask =
> 0x;
>   mutex_lock(>srbm_mutex);
>   vi_srbm_select(adev, 

RE: [PATCH v2] drm/amdgpu: fix solid screen and hard hang when 2k + 4k display connected with DP

2017-06-26 Thread Deucher, Alexander
> -Original Message-
> From: Huang Rui [mailto:ray.hu...@amd.com]
> Sent: Monday, June 26, 2017 8:18 AM
> To: amd-gfx@lists.freedesktop.org; Deucher, Alexander; Koenig, Christian
> Cc: Wang, Ken; Qiao, Joe(Markham); Jiang, Sonny; Huan, Alvin; Huang, Ray
> Subject: [PATCH v2] drm/amdgpu: fix solid screen and hard hang when 2k +
> 4k display connected with DP
> 
> Signed-off-by: Huang Rui 

Acked-by: Alex Deucher 

> ---
> 
> V1 -> V2:
> - Add READ_ONCE so the compiler cannot re-use a previously read fence
> value.
> 
> I will refine fence checking with fence interface at amdgpu_fence.c after S3
> stress issue completed as Christian's suggestion.
> 
> Thanks,
> Ray
> 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> index 5bed483..af3c87b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> @@ -130,7 +130,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
>   ret = psp_cmd_submit(psp, ucode, cmd_buf_mc_addr,
>fence_mc_addr, index);
> 
> - while (*((unsigned int *)psp->fence_buf) != index) {
> + while (READ_ONCE(*((unsigned int *)psp->fence_buf)) < index) {
>   msleep(1);
>   }
> 
> --
> 2.7.4



[PATCH 2/3] drm/amdgpu: fix amdgpu_debugfs_gem_bo_info

2017-06-26 Thread Christian König
From: Christian König 

Otherwise we trigger a bunch of WARN_ONs when this is called.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index 621f739..96c4493 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -784,6 +784,7 @@ static int amdgpu_debugfs_gem_bo_info(int id, void *ptr, 
void *data)
unsigned domain;
const char *placement;
unsigned pin_count;
+   uint64_t offset;
 
domain = amdgpu_mem_type_to_domain(bo->tbo.mem.mem_type);
switch (domain) {
@@ -798,9 +799,12 @@ static int amdgpu_debugfs_gem_bo_info(int id, void *ptr, 
void *data)
placement = " CPU";
break;
}
-   seq_printf(m, "\t0x%08x: %12ld byte %s @ 0x%010Lx",
-  id, amdgpu_bo_size(bo), placement,
-  amdgpu_bo_gpu_offset(bo));
+   seq_printf(m, "\t0x%08x: %12ld byte %s",
+  id, amdgpu_bo_size(bo), placement);
+
+   offset = ACCESS_ONCE(bo->tbo.mem.start);
+   if (offset != AMDGPU_BO_INVALID_OFFSET)
+   seq_printf(m, " @ 0x%010Lx", offset);
 
pin_count = ACCESS_ONCE(bo->pin_count);
if (pin_count)
-- 
2.7.4



[PATCH 1/3] drm/amdgpu: cleanup initializing gtt_size

2017-06-26 Thread Christian König
From: Christian König 

Stop spreading the code over all GMC generations.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h  |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c | 20 
 drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c| 10 +-
 drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c| 10 +-
 drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c| 10 +-
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c| 10 +-
 6 files changed, 25 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 7624294..ab1dad2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -561,6 +561,7 @@ struct amdgpu_gart {
const struct amdgpu_gart_funcs *gart_funcs;
 };
 
+void amdgpu_gart_set_defaults(struct amdgpu_device *adev);
 int amdgpu_gart_table_ram_alloc(struct amdgpu_device *adev);
 void amdgpu_gart_table_ram_free(struct amdgpu_device *adev);
 int amdgpu_gart_table_vram_alloc(struct amdgpu_device *adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
index ccef3cf..8877015 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
@@ -52,6 +52,26 @@
 /*
  * Common GART table functions.
  */
+
+/**
+ * amdgpu_gart_set_defaults - set the default gtt_size
+ *
+ * @adev: amdgpu_device pointer
+ *
+ * Set the default gtt_size based on parameters and available VRAM.
+ */
+void amdgpu_gart_set_defaults(struct amdgpu_device *adev)
+{
+   /* unless the user had overridden it, set the gart
+* size equal to the 1024 or vram, whichever is larger.
+*/
+   if (amdgpu_gart_size == -1)
+   adev->mc.gtt_size = max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
+   adev->mc.mc_vram_size);
+   else
+   adev->mc.gtt_size = (uint64_t)amdgpu_gart_size << 20;
+}
+
 /**
  * amdgpu_gart_table_ram_alloc - allocate system ram for gart page table
  *
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
index 35655a1..5cc3f39 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
@@ -321,15 +321,7 @@ static int gmc_v6_0_mc_init(struct amdgpu_device *adev)
adev->mc.real_vram_size = RREG32(mmCONFIG_MEMSIZE) * 1024ULL * 1024ULL;
adev->mc.visible_vram_size = adev->mc.aper_size;
 
-   /* unless the user had overridden it, set the gart
-* size equal to the 1024 or vram, whichever is larger.
-*/
-   if (amdgpu_gart_size == -1)
-   adev->mc.gtt_size = max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
-   adev->mc.mc_vram_size);
-   else
-   adev->mc.gtt_size = (uint64_t)amdgpu_gart_size << 20;
-
+   amdgpu_gart_set_defaults(adev);
gmc_v6_0_vram_gtt_location(adev, >mc);
 
return 0;
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
index 83c21a1..15f2c0f 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
@@ -373,15 +373,7 @@ static int gmc_v7_0_mc_init(struct amdgpu_device *adev)
if (adev->mc.visible_vram_size > adev->mc.real_vram_size)
adev->mc.visible_vram_size = adev->mc.real_vram_size;
 
-   /* unless the user had overridden it, set the gart
-* size equal to the 1024 or vram, whichever is larger.
-*/
-   if (amdgpu_gart_size == -1)
-   adev->mc.gtt_size = max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
-   adev->mc.mc_vram_size);
-   else
-   adev->mc.gtt_size = (uint64_t)amdgpu_gart_size << 20;
-
+   amdgpu_gart_set_defaults(adev);
gmc_v7_0_vram_gtt_location(adev, >mc);
 
return 0;
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
index c92af24..213af65 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
@@ -535,15 +535,7 @@ static int gmc_v8_0_mc_init(struct amdgpu_device *adev)
if (adev->mc.visible_vram_size > adev->mc.real_vram_size)
adev->mc.visible_vram_size = adev->mc.real_vram_size;
 
-   /* unless the user had overridden it, set the gart
-* size equal to the 1024 or vram, whichever is larger.
-*/
-   if (amdgpu_gart_size == -1)
-   adev->mc.gtt_size = max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
-   adev->mc.mc_vram_size);
-   else
-   adev->mc.gtt_size = (uint64_t)amdgpu_gart_size << 20;
-
+   amdgpu_gart_set_defaults(adev);
gmc_v8_0_vram_gtt_location(adev, >mc);
 
return 0;
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c

Re: [PATCH 5/8] drm/amdgpu: move priority decay logic into amd_sched_priority_ctr

2017-06-26 Thread Christian König
Sorry for the delay, back from a rather long sick leave today and trying 
to catch up with my work.


Quick ping on the query above. The query can be summarized as: which 
ioctl interface do we prefer for the override?


  * AMDGPU_SCHED_OP_PROCESS_PRIORITY_GET/PUT - refcounted override

or

  * AMDGPU_SCHED_OP_PROCESS_PRIORITY_SET - simple set


Please keep it simple and stupid. No extra fiddling here, that stuff is 
complicated enough.


So I would strongly suggest to have a simple set interface which the DRM 
master can use to set the priority.


If we have multiple DRM masters which feel responsible for this, they 
should sync up in userspace on what to do.
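
For reference, a minimal sketch of what the set-only uAPI could look like (the 
struct layout below is an assumption for illustration, not a final interface):

/* Assumed layout, for illustration only. */
union drm_amdgpu_sched {
	struct {
		__u32 op;	/* e.g. AMDGPU_SCHED_OP_PROCESS_PRIORITY_SET */
		__u32 fd;	/* fd identifying the target process */
		__s32 priority;	/* absolute priority to apply */
	} in;
};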


Regards,
Christian.


Re: [PATCH 1/8] drm/amd/amdgpu: Added asic_type as ACP DMA driver platform data

2017-06-26 Thread Christian König

Am 23.06.2017 um 19:05 schrieb Alex Deucher:

On Fri, Jun 23, 2017 at 12:43 PM, Christian König
 wrote:

Am 23.06.2017 um 18:34 schrieb Alex Deucher:

From: Vijendar Mukunda 

asic_type information is passed to ACP DMA Driver as platform data.
We need this to determine whether the asic is Carrizo (CZ) or
Stoney (ST) in the acp sound driver.

Reviewed-by: Alex Deucher 
Signed-off-by: Vijendar Mukunda 
Signed-off-by: Alex Deucher 
---
   drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c | 4 
   sound/soc/amd/acp-pcm-dma.c | 8 ++--
   sound/soc/amd/acp.h | 7 +++
   3 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c
index 06879d1..7a2a765 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c
@@ -262,6 +262,7 @@ static int acp_hw_init(void *handle)
 uint64_t acp_base;
 struct device *dev;
 struct i2s_platform_data *i2s_pdata;
+   u32 asic_type;
 struct amdgpu_device *adev = (struct amdgpu_device *)handle;
   @@ -271,6 +272,7 @@ static int acp_hw_init(void *handle)
 if (!ip_block)
 return -EINVAL;
   + asic_type = adev->asic_type;
 r = amd_acp_hw_init(adev->acp.cgs_device,
 ip_block->version->major,
ip_block->version->minor);
 /* -ENODEV means board uses AZ rather than ACP */
@@ -355,6 +357,8 @@ static int acp_hw_init(void *handle)
 adev->acp.acp_cell[0].name = "acp_audio_dma";
 adev->acp.acp_cell[0].num_resources = 4;
 adev->acp.acp_cell[0].resources = >acp.acp_res[0];
+   adev->acp.acp_cell[0].platform_data = _type;
+   adev->acp.acp_cell[0].pdata_size = sizeof(asic_type);


Have the painkillers jellyfied my brain or are you setting a pointer in the
amdgpu device structure to some variable on the stack?

If it's just for initialization then that's probably ok, but I would reset
the pointer at the end of the function.

Otherwise somebody might have the idea to dereference it later on and that
can only lead to trouble.

I think it's ok because the hotplug of the audio device is triggered
from this function, so the audio device should be probed by the time
this variable goes out of scope.  That said, it would probably be
better to use the asic_type from the GPU driver instance directly
rather than a stack variable.


Yeah, agree. But keeping a stale reference to a stack variable in the 
driver struct is still ugly as hell.


At the bare minimum set this to NULL at the end of the function, or even 
better use an intptr_t to encode the asic type into the pointer.
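
Something along these lines, perhaps (a sketch of the idea only; the helper 
names are made up and the interaction with the MFD cell's pdata_size copying 
is glossed over):

/* Producer: the pointer value itself carries the data, so there is no
 * backing storage that can go stale.
 */
static void acp_encode_asic_type(struct platform_device *pdev, u32 asic_type)
{
	pdev->dev.platform_data = (void *)(uintptr_t)asic_type;
}

/* Consumer: decode the value back instead of dereferencing the pointer. */
static u32 acp_decode_asic_type(struct platform_device *pdev)
{
	return (u32)(uintptr_t)pdev->dev.platform_data;
}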


Christian.



Alex



Christian.



 adev->acp.acp_cell[1].name = "designware-i2s";
 adev->acp.acp_cell[1].num_resources = 1;
diff --git a/sound/soc/amd/acp-pcm-dma.c b/sound/soc/amd/acp-pcm-dma.c
index 08b1399..dcbf997 100644
--- a/sound/soc/amd/acp-pcm-dma.c
+++ b/sound/soc/amd/acp-pcm-dma.c
@@ -73,12 +73,6 @@ static const struct snd_pcm_hardware
acp_pcm_hardware_capture = {
 .periods_max = CAPTURE_MAX_NUM_PERIODS,
   };
   -struct audio_drv_data {
-   struct snd_pcm_substream *play_stream;
-   struct snd_pcm_substream *capture_stream;
-   void __iomem *acp_mmio;
-};
-
   static u32 acp_reg_read(void __iomem *acp_mmio, u32 reg)
   {
 return readl(acp_mmio + (reg * 4));
@@ -916,6 +910,7 @@ static int acp_audio_probe(struct platform_device
*pdev)
 int status;
 struct audio_drv_data *audio_drv_data;
 struct resource *res;
+   const u32 *pdata = pdev->dev.platform_data;
 audio_drv_data = devm_kzalloc(>dev, sizeof(struct
audio_drv_data),
 GFP_KERNEL);
@@ -932,6 +927,7 @@ static int acp_audio_probe(struct platform_device
*pdev)
 audio_drv_data->play_stream = NULL;
 audio_drv_data->capture_stream = NULL;
+   audio_drv_data->asic_type =  *pdata;
 res = platform_get_resource(pdev, IORESOURCE_IRQ, 0);
 if (!res) {
diff --git a/sound/soc/amd/acp.h b/sound/soc/amd/acp.h
index 330832e..28cf914 100644
--- a/sound/soc/amd/acp.h
+++ b/sound/soc/amd/acp.h
@@ -84,6 +84,13 @@ struct audio_substream_data {
 void __iomem *acp_mmio;
   };
   +struct audio_drv_data {
+   struct snd_pcm_substream *play_stream;
+   struct snd_pcm_substream *capture_stream;
+   void __iomem *acp_mmio;
+   u32 asic_type;
+};
+
   enum {
 ACP_TILE_P1 = 0,
 ACP_TILE_P2,







Re: [PATCH v2] drm/amdgpu: fix solid screen and hard hang when 2k + 4k display connected with DP

2017-06-26 Thread Christian König

Am 26.06.2017 um 14:18 schrieb Huang Rui:

Signed-off-by: Huang Rui 


Acked-by: Christian König .


---

V1 -> V2:
- Add READ_ONCE so the compiler cannot re-use a previously read fence value.

I will refine fence checking with fence interface at amdgpu_fence.c after S3
stress issue completed as Christian's suggestion.

Thanks,
Ray

---
  drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index 5bed483..af3c87b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -130,7 +130,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
ret = psp_cmd_submit(psp, ucode, cmd_buf_mc_addr,
 fence_mc_addr, index);
  
-	while (*((unsigned int *)psp->fence_buf) != index) {

+   while (READ_ONCE(*((unsigned int *)psp->fence_buf)) < index) {
msleep(1);
}
  





[PATCH v2] drm/amdgpu: fix solid screen and hard hang when 2k + 4k display connected with DP

2017-06-26 Thread Huang Rui
Signed-off-by: Huang Rui 
---

V1 -> V2:
- Add READ_ONCE so the compiler cannot re-use a previously read fence value.

I will refine fence checking with fence interface at amdgpu_fence.c after S3
stress issue completed as Christian's suggestion.

Thanks,
Ray

---
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index 5bed483..af3c87b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -130,7 +130,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
ret = psp_cmd_submit(psp, ucode, cmd_buf_mc_addr,
 fence_mc_addr, index);
 
-   while (*((unsigned int *)psp->fence_buf) != index) {
+   while (READ_ONCE(*((unsigned int *)psp->fence_buf)) < index) {
msleep(1);
}
 
-- 
2.7.4



Re: [PATCH] drm/amdgpu: fix solid screen and hard hang when 2k + 4k display connected with DP

2017-06-26 Thread Christian König

Am 26.06.2017 um 12:53 schrieb Huang Rui:

On Mon, Jun 26, 2017 at 03:18:52PM +0800, Christian König wrote:

Am 26.06.2017 um 09:03 schrieb Huang Rui:

Signed-off-by: Huang Rui 
Cc: Xiaojie Yuan 
---
   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c

index 5bed483..60d0455 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -130,7 +130,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
ret = psp_cmd_submit(psp, ucode, cmd_buf_mc_addr,
 fence_mc_addr, index);
  
- while (*((unsigned int *)psp->fence_buf) != index) {

+ while (*((unsigned int *)psp->fence_buf) < index) {

That is still horribly incorrect. At the bare minimum you need to use the
READ_ONCE() macro here.


So, READ_ONCE(*((unsigned int *)psp->fence_buf)) < index.
Do you mean the compiler might read the previous fence buffer
value if we don't add READ_ONCE?


Yes, exactly. I would guess that this only works by coincidence because 
the msleep() is causing a barrier here.
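
For anyone wondering what READ_ONCE() buys in this loop, a minimal sketch of 
the pattern (READ_ONCE() comes from linux/compiler.h, msleep() from 
linux/delay.h):

static void psp_wait_for_fence_sketch(unsigned int *fence_buf,
				      unsigned int index)
{
	/*
	 * Nothing in the loop body writes *fence_buf, so without
	 * READ_ONCE() the compiler may legally hoist the load out of the
	 * loop and then spin on a stale value forever.
	 */
	while (READ_ONCE(*fence_buf) < index)
		msleep(1);
}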


Christian.




But it would be even better to completely drop this and use the fence
implementation in amdgpu_fence.c as well.


I understand, I will refactor it after S3 stress completed.

Thanks,
Ray





Re: [PATCH] drm/amdgpu: fix solid screen and hard hang when 2k + 4k display connected with DP

2017-06-26 Thread Huang Rui
On Mon, Jun 26, 2017 at 03:18:52PM +0800, Christian König wrote:
> Am 26.06.2017 um 09:03 schrieb Huang Rui:
> > Signed-off-by: Huang Rui 
> > Cc: Xiaojie Yuan 
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> > index 5bed483..60d0455 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> > @@ -130,7 +130,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
> >ret = psp_cmd_submit(psp, ucode, cmd_buf_mc_addr,
> > fence_mc_addr, index);
> >  
> > - while (*((unsigned int *)psp->fence_buf) != index) {
> > + while (*((unsigned int *)psp->fence_buf) < index) {
> 
> That is still horribly incorrect. At the bare minimum you need to use the
> READ_ONCE() macro here.
> 

So, READ_ONCE(*((unsigned int *)psp->fence_buf)) < index.
Do you mean the compiler might read the previous fence buffer
value if we don't add READ_ONCE?

> But it would be even better to completely drop this and use the fence
> implementation in amdgpu_fence.c as well.
> 

I understand, I will refactor it after S3 stress completed.

Thanks,
Ray


Re: [PATCH 2/9] drm/amdgpu: Add vis_vramlimit module parameter

2017-06-26 Thread Christian König

Am 23.06.2017 um 19:39 schrieb John Brooks:

Allow specifying a limit on visible VRAM via a module parameter. This is
helpful for testing performance under visible VRAM pressure.

Signed-off-by: John Brooks 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 4 
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 8 
  3 files changed, 13 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index c407d2d..fe73633 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -74,6 +74,7 @@
   */
  extern int amdgpu_modeset;
  extern int amdgpu_vram_limit;
+extern int amdgpu_vis_vram_limit;
  extern int amdgpu_gart_size;
  extern int amdgpu_moverate;
  extern int amdgpu_benchmarking;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 4c7c262..a14f458 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -73,6 +73,7 @@
  #define KMS_DRIVER_PATCHLEVEL 0
  
  int amdgpu_vram_limit = 0;

+int amdgpu_vis_vram_limit = 0;
  int amdgpu_gart_size = -1; /* auto */
  int amdgpu_moverate = -1; /* auto */
  int amdgpu_benchmarking = 0;
@@ -119,6 +120,9 @@ int amdgpu_lbpw = -1;
  MODULE_PARM_DESC(vramlimit, "Restrict VRAM for testing, in megabytes");
  module_param_named(vramlimit, amdgpu_vram_limit, int, 0600);
  
+MODULE_PARM_DESC(vis_vramlimit, "Restrict visible VRAM for testing, in megabytes");

+module_param_named(vis_vramlimit, amdgpu_vis_vram_limit, int, 0444);
+
  MODULE_PARM_DESC(gartsize, "Size of PCIE/IGP gart to setup in megabytes (32, 64, 
etc., -1 = auto)");
  module_param_named(gartsize, amdgpu_gart_size, int, 0600);
  
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c

index c9b131b..0676a78 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1095,6 +1095,7 @@ static struct ttm_bo_driver amdgpu_bo_driver = {
  int amdgpu_ttm_init(struct amdgpu_device *adev)
  {
int r;
+   u64 vis_vram_limit;
  
  	r = amdgpu_ttm_global_init(adev);

if (r) {
@@ -1118,6 +1119,13 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
DRM_ERROR("Failed initializing VRAM heap.\n");
return r;
}
+
+   /* Reduce size of CPU-visible VRAM if requested */
+   vis_vram_limit = amdgpu_vis_vram_limit * 1024 * 1024;


You need a 64bit cast here, otherwise you can get an overrun.
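
I.e. something like the following; the multiplication is otherwise done in 
32-bit int arithmetic, so any limit of 2048 MB or more would overflow before 
the result is widened to u64:

	vis_vram_limit = (u64)amdgpu_vis_vram_limit * 1024 * 1024;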

Apart from that the patch is Reviewed-by: Christian König 
.


Regards,
Christian.


+   if (amdgpu_vis_vram_limit > 0 &&
+   vis_vram_limit <= adev->mc.visible_vram_size)
+   adev->mc.visible_vram_size = vis_vram_limit;
+
/* Change the size here instead of the init above so only lpfn is 
affected */
amdgpu_ttm_set_active_vram_size(adev, adev->mc.visible_vram_size);
  





Re: [PATCH 2/9] drm/amdgpu: Add vis_vramlimit module parameter

2017-06-26 Thread Michel Dänzer
On 24/06/17 02:39 AM, John Brooks wrote:
> Allow specifying a limit on visible VRAM via a module parameter. This is
> helpful for testing performance under visible VRAM pressure.
> 
> Signed-off-by: John Brooks 

Reviewed-by: Michel Dänzer 


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


Re: [PATCH 3/9] drm/amdgpu: Don't force BOs into visible VRAM for page faults

2017-06-26 Thread Michel Dänzer
On 24/06/17 02:39 AM, John Brooks wrote:
> There is no need for page faults to force BOs into visible VRAM if it's
> full, and the time it takes to do so is great enough to cause noticeable
> stuttering. Add GTT as a possible placement so that if visible VRAM is
> full, page faults move BOs to GTT instead of evicting other BOs from VRAM.
> 
> Signed-off-by: John Brooks 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 14 ++
>  1 file changed, 6 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index 244a7d3..751bc05 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -981,10 +981,11 @@ int amdgpu_bo_fault_reserve_notify(struct 
> ttm_buffer_object *bo)
>  
>   /* hurrah the memory is not visible ! */
>   atomic64_inc(>num_vram_cpu_page_faults);
> - amdgpu_ttm_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_VRAM);
> + amdgpu_ttm_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_VRAM |
> +  AMDGPU_GEM_DOMAIN_GTT);
>   lpfn =  adev->mc.visible_vram_size >> PAGE_SHIFT;
>   for (i = 0; i < abo->placement.num_placement; i++) {
> - /* Force into visible VRAM */
> + /* Try to move the BO into visible VRAM */
>   if ((abo->placements[i].flags & TTM_PL_FLAG_VRAM) &&
>   (!abo->placements[i].lpfn ||
>abo->placements[i].lpfn > lpfn))
> @@ -993,16 +994,13 @@ int amdgpu_bo_fault_reserve_notify(struct 
> ttm_buffer_object *bo)
>   abo->placement.busy_placement = abo->placement.placement;
>   abo->placement.num_busy_placement = abo->placement.num_placement;
>   r = ttm_bo_validate(bo, >placement, false, false);
> - if (unlikely(r == -ENOMEM)) {
> - amdgpu_ttm_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_GTT);
> - return ttm_bo_validate(bo, >placement, false, false);
> - } else if (unlikely(r != 0)) {
> + if (unlikely(r != 0))
>   return r;
> - }
>  
>   offset = bo->mem.start << PAGE_SHIFT;
>   /* this should never happen */
> - if ((offset + size) > adev->mc.visible_vram_size)
> + if (bo->mem.mem_type == TTM_PL_VRAM &&
> + (offset + size) > adev->mc.visible_vram_size)
>   return -EINVAL;
>  
>   return 0;
> 

AFAICT this is essentially the same as
https://patchwork.freedesktop.org/patch/156993/ . I retracted that patch
due to it causing Rally Dirt performance degradation for Marek.
Presumably other patches in this series compensate for that, but at the
least this patch should be moved to the end of the series.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


Re: [PATCH v2 00/14] improve the fb_setcmap helper

2017-06-26 Thread Daniel Vetter
On Thu, Jun 22, 2017 at 08:06:23AM +0200, Peter Rosin wrote:
> Hi!
> 
> While trying to get CLUT support for the atmel_hlcdc driver, and
> specifically for the emulated fbdev interface, I received some
> push-back that my feeble in-driver attempts should be solved
> by the core. This is my attempt to do it right.
> 
> I have obviously not tested all of this with more than a compile,
> but patches 1 and 3 are enough to make the atmel-hlcdc driver
> do what I need (when patched to support CLUT modes). The rest is
> just lots of removals and cleanup made possible by the improved
> core.
> 
> Please test, I would not be surprised if I have fouled up some
> bit-manipulation somewhere in this mostly mechanical change...
> 
> Changes since v1:
> 
> - Rebased to next-20170621
> - Split 1/11 into a preparatory patch, a cleanup patch and then
>   the meat in 3/14.
> - Handle pseudo-palette for FB_VISUAL_TRUECOLOR.
> - Removed the empty .gamma_get/.gamma_set fb helpers from the
>   armada driver that I had somehow managed to ignore but which
>   0day found real quick.
> - Be less judgemental on drivers only providing .gamma_get and
>   .gamma_set, but no .load_lut. That's actually a valid thing
>   to do if you only need pseudo-palette for FB_VISUAL_TRUECOLOR.
> - Add a comment about colliding bitfields in the nouveau driver.
> - Remove gamma_set/gamma_get declarations from the radeon driver
>   (the definitions were removed in v1).

Ok some nits/questions on the first three, but in principle looks all ok I
think. The driver patches also look good (but I didn't yet carefully
review all the conversion). What we might want to do is entirely remove
driver's reliance on ->gamma_store (mostly amounts to in-lining the
load_lut functions) and only update ->gamma_store after gamma_set returned
successfully. But that's a bit more work.

Saving/restoring it instead might be a simpler fix for that bug, but since it's
pre-existing, fixing it in a follow-up is also ok.
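
A rough sketch of the save/restore variant (illustrative only; it assumes the 
current six-argument .gamma_set signature and elides the acquire-ctx plumbing, 
hence the NULL):

static int gamma_set_or_rollback(struct drm_crtc *crtc,
				 u16 *r, u16 *g, u16 *b, uint32_t size)
{
	size_t len = 3 * (size_t)size * sizeof(u16);
	u16 *backup;
	int ret;

	/* Snapshot ->gamma_store before the commit path can touch it. */
	backup = kmemdup(crtc->gamma_store, len, GFP_KERNEL);
	if (!backup)
		return -ENOMEM;

	ret = crtc->funcs->gamma_set(crtc, r, g, b, size, NULL);
	if (ret)
		/* Failed commit: put the store back in sync with the hw. */
		memcpy(crtc->gamma_store, backup, len);

	kfree(backup);
	return ret;
}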

Cheers, Daniel

> 
> Cheers,
> peda
> 
> Peter Rosin (14):
>   drm/fb-helper: keep the .gamma_store updated in drm_fb_helper_setcmap
>   drm/fb-helper: remove drm_fb_helper_save_lut_atomic
>   drm/fb-helper: do a generic fb_setcmap helper in terms of crtc
> .gamma_set
>   drm: amd: remove dead code and pointless local lut storage
>   drm: armada: remove dead empty functions
>   drm: ast: remove dead code and pointless local lut storage
>   drm: cirrus: remove dead code and pointless local lut storage
>   drm: gma500: remove dead code and pointless local lut storage
>   drm: i915: remove dead code and pointless local lut storage
>   drm: mgag200: remove dead code and pointless local lut storage
>   drm: nouveau: remove dead code and pointless local lut storage
>   drm: radeon: remove dead code and pointless local lut storage
>   drm: stm: remove dead code and pointless local lut storage
>   drm: remove unused and redundant callbacks
> 
>  drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c  |  24 
>  drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h|   1 -
>  drivers/gpu/drm/amd/amdgpu/dce_v10_0.c  |  27 ++---
>  drivers/gpu/drm/amd/amdgpu/dce_v11_0.c  |  27 ++---
>  drivers/gpu/drm/amd/amdgpu/dce_v6_0.c   |  27 ++---
>  drivers/gpu/drm/amd/amdgpu/dce_v8_0.c   |  27 ++---
>  drivers/gpu/drm/amd/amdgpu/dce_virtual.c|  23 
>  drivers/gpu/drm/armada/armada_crtc.c|  10 --
>  drivers/gpu/drm/armada/armada_crtc.h|   2 -
>  drivers/gpu/drm/armada/armada_fbdev.c   |   2 -
>  drivers/gpu/drm/ast/ast_drv.h   |   1 -
>  drivers/gpu/drm/ast/ast_fb.c|  20 
>  drivers/gpu/drm/ast/ast_mode.c  |  26 +
>  drivers/gpu/drm/cirrus/cirrus_drv.h |   8 --
>  drivers/gpu/drm/cirrus/cirrus_fbdev.c   |   2 -
>  drivers/gpu/drm/cirrus/cirrus_mode.c|  71 +++-
>  drivers/gpu/drm/drm_fb_helper.c | 164 
> +---
>  drivers/gpu/drm/gma500/framebuffer.c|  22 
>  drivers/gpu/drm/gma500/gma_display.c|  32 ++
>  drivers/gpu/drm/gma500/psb_intel_display.c  |   7 +-
>  drivers/gpu/drm/gma500/psb_intel_drv.h  |   1 -
>  drivers/gpu/drm/i915/intel_drv.h|   1 -
>  drivers/gpu/drm/i915/intel_fbdev.c  |  31 --
>  drivers/gpu/drm/mgag200/mgag200_drv.h   |   5 -
>  drivers/gpu/drm/mgag200/mgag200_fb.c|   2 -
>  drivers/gpu/drm/mgag200/mgag200_mode.c  |  62 +++
>  drivers/gpu/drm/nouveau/dispnv04/crtc.c |  26 ++---
>  drivers/gpu/drm/nouveau/nouveau_crtc.h  |   3 -
>  drivers/gpu/drm/nouveau/nouveau_fbcon.c |  22 
>  drivers/gpu/drm/nouveau/nv50_display.c  |  40 +++
>  drivers/gpu/drm/radeon/atombios_crtc.c  |   1 -
>  drivers/gpu/drm/radeon/radeon_connectors.c  |   7 +-
>  drivers/gpu/drm/radeon/radeon_display.c |  71 +---
>  drivers/gpu/drm/radeon/radeon_fb.c  |   2 -
>  drivers/gpu/drm/radeon/radeon_legacy_crtc.c |   1 -
>  drivers/gpu/drm/radeon/radeon_mode.h 

Re: [PATCH v2 03/14] drm/fb-helper: do a generic fb_setcmap helper in terms of crtc .gamma_set

2017-06-26 Thread Daniel Vetter
On Thu, Jun 22, 2017 at 12:22:10PM +0200, Peter Rosin wrote:
> This makes the redundant fb helpers .load_lut, .gamma_set and .gamma_get
> totally obsolete.
> 
> Signed-off-by: Peter Rosin 
> ---
>  drivers/gpu/drm/drm_fb_helper.c | 151 
> +---
>  1 file changed, 63 insertions(+), 88 deletions(-)
> 
> This is an alternative version rebased on top of Daniel's "fbdev helper
> locking rework and deferred setup" series.
> 
> And as noted by Daniel, .gamma_set does an atomic commit. Thus, the locks
> needs to be dropped and reacquired for each crtc. So, that is fixed here
> too. Doing it like this with a couple of individual alternative patches
> instead of sending a whole new series since the dependency on Daniel's
> series makes life somewhat difficult...
> 
> Cheers,
> peda
> 
> diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
> index 4aceb59..aa025f1 100644
> --- a/drivers/gpu/drm/drm_fb_helper.c
> +++ b/drivers/gpu/drm/drm_fb_helper.c
> @@ -1257,50 +1257,6 @@ void drm_fb_helper_set_suspend_unlocked(struct 
> drm_fb_helper *fb_helper,
>  }
>  EXPORT_SYMBOL(drm_fb_helper_set_suspend_unlocked);
>  
> -static int setcolreg(struct drm_crtc *crtc, u16 red, u16 green,
> -  u16 blue, u16 regno, struct fb_info *info)
> -{
> - struct drm_fb_helper *fb_helper = info->par;
> - struct drm_framebuffer *fb = fb_helper->fb;
> -
> - if (info->fix.visual == FB_VISUAL_TRUECOLOR) {
> - u32 *palette;
> - u32 value;
> - /* place color in psuedopalette */
> - if (regno > 16)
> - return -EINVAL;
> - palette = (u32 *)info->pseudo_palette;
> - red >>= (16 - info->var.red.length);
> - green >>= (16 - info->var.green.length);
> - blue >>= (16 - info->var.blue.length);
> - value = (red << info->var.red.offset) |
> - (green << info->var.green.offset) |
> - (blue << info->var.blue.offset);
> - if (info->var.transp.length > 0) {
> - u32 mask = (1 << info->var.transp.length) - 1;
> -
> - mask <<= info->var.transp.offset;
> - value |= mask;
> - }
> - palette[regno] = value;
> - return 0;
> - }
> -
> - /*
> -  * The driver really shouldn't advertise pseudo/directcolor
> -  * visuals if it can't deal with the palette.
> -  */
> - if (WARN_ON(!fb_helper->funcs->gamma_set ||
> - !fb_helper->funcs->gamma_get))
> - return -EINVAL;
> -
> - WARN_ON(fb->format->cpp[0] != 1);
> -
> - fb_helper->funcs->gamma_set(crtc, red, green, blue, regno);
> -
> - return 0;
> -}
> -
>  /**
>   * drm_fb_helper_setcmap - implementation for _ops.fb_setcmap
>   * @cmap: cmap to set
> @@ -1310,12 +1266,10 @@ int drm_fb_helper_setcmap(struct fb_cmap *cmap, 
> struct fb_info *info)
>  {
>   struct drm_fb_helper *fb_helper = info->par;
>   struct drm_device *dev = fb_helper->dev;
> - const struct drm_crtc_helper_funcs *crtc_funcs;
> - u16 *red, *green, *blue, *transp;
> + struct drm_modeset_acquire_ctx ctx;
>   struct drm_crtc *crtc;
>   u16 *r, *g, *b;
> - int i, j, rc = 0;
> - int start;
> + int i, ret = 0;
>  
>   if (oops_in_progress)
>   return -EBUSY;
> @@ -1329,61 +1283,82 @@ int drm_fb_helper_setcmap(struct fb_cmap *cmap, 
> struct fb_info *info)
>   return -EBUSY;
>   }
>  
> - drm_modeset_lock_all(dev);
> - for (i = 0; i < fb_helper->crtc_count; i++) {
> - crtc = fb_helper->crtc_info[i].mode_set.crtc;
> - crtc_funcs = crtc->helper_private;
> + drm_modeset_acquire_init(, 0);
>  
> - red = cmap->red;
> - green = cmap->green;
> - blue = cmap->blue;
> - transp = cmap->transp;
> - start = cmap->start;
> + for (i = 0; i < fb_helper->crtc_count; i++) {
> + if (info->fix.visual == FB_VISUAL_TRUECOLOR) {
> + u32 *palette;
> + int j;
>  
> - if (info->fix.visual != FB_VISUAL_TRUECOLOR) {
> - if (!crtc->gamma_size) {
> - rc = -EINVAL;
> - goto out;
> + if (cmap->start + cmap->len > 16) {
> + ret = -EINVAL;
> + break;
>   }
>  
> - if (cmap->start + cmap->len > crtc->gamma_size) {
> - rc = -EINVAL;
> - goto out;
> + palette = (u32 *)info->pseudo_palette;
> + for (j = 0; j < cmap->len; ++j) {
> + u16 red = cmap->red[j];
> + u16 green = cmap->green[j];
> + 

Re: [PATCH v2 01/14] drm/fb-helper: keep the .gamma_store updated in drm_fb_helper_setcmap

2017-06-26 Thread Daniel Vetter
On Thu, Jun 22, 2017 at 08:06:24AM +0200, Peter Rosin wrote:
> I think the gamma_store can end up invalid on error. But the way I read
> it, that can happen in drm_mode_gamma_set_ioctl as well, so why should
> this pesky legacy fbdev stuff be any better?
> 
> Signed-off-by: Peter Rosin 
> ---
>  drivers/gpu/drm/drm_fb_helper.c | 27 +++
>  1 file changed, 27 insertions(+)
> 
> diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
> index 574af01..25315fb 100644
> --- a/drivers/gpu/drm/drm_fb_helper.c
> +++ b/drivers/gpu/drm/drm_fb_helper.c
> @@ -1223,12 +1223,16 @@ int drm_fb_helper_setcmap(struct fb_cmap *cmap, 
> struct fb_info *info)
>   const struct drm_crtc_helper_funcs *crtc_funcs;
>   u16 *red, *green, *blue, *transp;
>   struct drm_crtc *crtc;
> + u16 *r, *g, *b;
>   int i, j, rc = 0;
>   int start;
>  
>   if (oops_in_progress)
>   return -EBUSY;
>  
> + if (cmap->start + cmap->len < cmap->start)
> + return -EINVAL;

Doesn't the fbdev core check this for us?

> +
>   drm_modeset_lock_all(dev);
>   if (!drm_fb_helper_is_bound(fb_helper)) {
>   drm_modeset_unlock_all(dev);
> @@ -1245,6 +1249,29 @@ int drm_fb_helper_setcmap(struct fb_cmap *cmap, struct 
> fb_info *info)
>   transp = cmap->transp;
>   start = cmap->start;
>  
> + if (info->fix.visual != FB_VISUAL_TRUECOLOR) {

I think switching to a positive check of visual == PSEUDOCOLOR is a bit
clearer. Also this makes more sense once we move over to the main
gamma_set function, since as-is it's unclear that setcolreg only does
something for PSEUDOCOLOR. But in the interim I guess that's ok.
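
I.e. something like this (sketch of the positive check):

	if (info->fix.visual == FB_VISUAL_PSEUDOCOLOR) {
		/* palette path: write ->gamma_store and call .gamma_set */
	}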
-Daniel

> + if (!crtc->gamma_size) {
> + rc = -EINVAL;
> + goto out;
> + }
> +
> + if (cmap->start + cmap->len > crtc->gamma_size) {
> + rc = -EINVAL;
> + goto out;
> + }
> +
> + r = crtc->gamma_store;
> + g = r + crtc->gamma_size;
> + b = g + crtc->gamma_size;
> +
> + memcpy(r + cmap->start, cmap->red,
> +cmap->len * sizeof(u16));
> + memcpy(g + cmap->start, cmap->green,
> +cmap->len * sizeof(u16));
> + memcpy(b + cmap->start, cmap->blue,
> +cmap->len * sizeof(u16));
> + }
> +
>   for (j = 0; j < cmap->len; j++) {
>   u16 hred, hgreen, hblue, htransp = 0x;
>  
> -- 
> 2.1.4
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 6/9] drm/amdgpu: Set/clear CPU_ACCESS_REQUIRED flag on page fault and CS

2017-06-26 Thread Michel Dänzer
On 25/06/17 03:00 AM, Christian König wrote:
> Am 23.06.2017 um 19:39 schrieb John Brooks:
>> When the AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED flag is given by
>> userspace,
>> it should only be treated as a hint to initially place a BO somewhere CPU
>> accessible, rather than having a permanent effect on BO placement.
> 
> And that is a clear NAK from my side.
> 
> CPU_ACCESS_REQUIRED is a permanent limitation on where the buffer should
> be placed.

It really can't be more than a hint. The userspace driver cannot
reliably know ahead of time whether a BO will be accessed by the CPU at
all, let alone how often. A BO which incorrectly has this flag set
creates artificial pressure on CPU visible VRAM.


> Ignoring that and changing the IOCTL interface like this is clearly not
> acceptable.

I'd say it's more adapting the semantics of the flag to reality. :)


>> Instead of the flag being set in stone at BO creation, set the flag
>> when a page fault occurs so that it goes somewhere CPU-visible, and clear it
>> when the BO is requested by the GPU.

My idea was to only clear the flag when the BO is moved from GTT to
VRAM. That way, any BO which was ever accessed by the CPU since it was
last moved to VRAM will be preferably put in the CPU visible part of VRAM.

Not sure which way is better though.
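
In code terms the two policies differ only in where the flag is cleared 
(a sketch, not the actual series; the move check would sit in the BO move 
path):

/* On a CPU page fault: remember that the CPU touched this BO. */
abo->flags |= AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;

/*
 * Policy A (this series): clear the flag whenever CS wants the BO in VRAM.
 * Policy B (suggested above): clear it only on an actual GTT -> VRAM move,
 * so any BO faulted since its last VRAM placement keeps preferring the
 * CPU-visible part of VRAM.
 */
if (new_mem->mem_type == TTM_PL_VRAM && old_mem->mem_type == TTM_PL_TT)
	abo->flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;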


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


Re: [PATCH v2 03/14] drm/fb-helper: do a generic fb_setcmap helper in terms of crtc .gamma_set

2017-06-26 Thread Daniel Vetter
On Thu, Jun 22, 2017 at 08:06:26AM +0200, Peter Rosin wrote:
> This makes the redundant fb helpers .load_lut, .gamma_set and .gamma_get
> totally obsolete.
> 
> Signed-off-by: Peter Rosin 
> ---
>  drivers/gpu/drm/drm_fb_helper.c | 154 
> 
>  1 file changed, 63 insertions(+), 91 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
> index 7ade384..58eb045 100644
> --- a/drivers/gpu/drm/drm_fb_helper.c
> +++ b/drivers/gpu/drm/drm_fb_helper.c
> @@ -1150,50 +1150,6 @@ void drm_fb_helper_set_suspend_unlocked(struct 
> drm_fb_helper *fb_helper,
>  }
>  EXPORT_SYMBOL(drm_fb_helper_set_suspend_unlocked);
>  
> -static int setcolreg(struct drm_crtc *crtc, u16 red, u16 green,
> -  u16 blue, u16 regno, struct fb_info *info)
> -{
> - struct drm_fb_helper *fb_helper = info->par;
> - struct drm_framebuffer *fb = fb_helper->fb;
> -
> - if (info->fix.visual == FB_VISUAL_TRUECOLOR) {
> - u32 *palette;
> - u32 value;
> - /* place color in psuedopalette */
> - if (regno > 16)
> - return -EINVAL;
> - palette = (u32 *)info->pseudo_palette;
> - red >>= (16 - info->var.red.length);
> - green >>= (16 - info->var.green.length);
> - blue >>= (16 - info->var.blue.length);
> - value = (red << info->var.red.offset) |
> - (green << info->var.green.offset) |
> - (blue << info->var.blue.offset);
> - if (info->var.transp.length > 0) {
> - u32 mask = (1 << info->var.transp.length) - 1;
> -
> - mask <<= info->var.transp.offset;
> - value |= mask;
> - }
> - palette[regno] = value;
> - return 0;
> - }
> -
> - /*
> -  * The driver really shouldn't advertise pseudo/directcolor
> -  * visuals if it can't deal with the palette.
> -  */
> - if (WARN_ON(!fb_helper->funcs->gamma_set ||
> - !fb_helper->funcs->gamma_get))
> - return -EINVAL;
> -
> - WARN_ON(fb->format->cpp[0] != 1);
> -
> - fb_helper->funcs->gamma_set(crtc, red, green, blue, regno);
> -
> - return 0;
> -}
> -
>  /**
>   * drm_fb_helper_setcmap - implementation for _ops.fb_setcmap
>   * @cmap: cmap to set
> @@ -1203,12 +1159,10 @@ int drm_fb_helper_setcmap(struct fb_cmap *cmap, 
> struct fb_info *info)
>  {
>   struct drm_fb_helper *fb_helper = info->par;
>   struct drm_device *dev = fb_helper->dev;
> - const struct drm_crtc_helper_funcs *crtc_funcs;
> - u16 *red, *green, *blue, *transp;
> + struct drm_modeset_acquire_ctx ctx;
>   struct drm_crtc *crtc;
>   u16 *r, *g, *b;
> - int i, j, rc = 0;
> - int start;
> + int i, ret;
>  
>   if (oops_in_progress)
>   return -EBUSY;
> @@ -1216,65 +1170,83 @@ int drm_fb_helper_setcmap(struct fb_cmap *cmap, 
> struct fb_info *info)
>   if (cmap->start + cmap->len < cmap->start)
>   return -EINVAL;
>  
> - drm_modeset_lock_all(dev);
> + drm_modeset_acquire_init(&ctx, 0);
> +retry:
> + ret = drm_modeset_lock_all_ctx(dev, &ctx);
> + if (ret)
> + goto out;
>   if (!drm_fb_helper_is_bound(fb_helper)) {
> - drm_modeset_unlock_all(dev);
> - return -EBUSY;
> + ret = -EBUSY;
> + goto out;
>   }
>  
>   for (i = 0; i < fb_helper->crtc_count; i++) {
> - crtc = fb_helper->crtc_info[i].mode_set.crtc;
> - crtc_funcs = crtc->helper_private;
> -
> - red = cmap->red;
> - green = cmap->green;
> - blue = cmap->blue;
> - transp = cmap->transp;
> - start = cmap->start;
> + if (info->fix.visual == FB_VISUAL_TRUECOLOR) {
> + u32 *palette;
> + int j;
>  
> - if (info->fix.visual != FB_VISUAL_TRUECOLOR) {
> - if (!crtc->gamma_size) {
> - rc = -EINVAL;
> + if (cmap->start + cmap->len > 16) {
> + ret = -EINVAL;
>   goto out;
>   }
>  
> - if (cmap->start + cmap->len > crtc->gamma_size) {
> - rc = -EINVAL;
> - goto out;
> + palette = (u32 *)info->pseudo_palette;
> + for (j = 0; j < cmap->len; ++j) {
> + u16 red = cmap->red[j];
> + u16 green = cmap->green[j];
> + u16 blue = cmap->blue[j];
> + u32 value;
> +
> + red >>= 16 - info->var.red.length;
> + green >>= 16 - info->var.green.length;
> +  

Re: [PATCH v2 02/14] drm/fb-helper: remove drm_fb_helper_save_lut_atomic

2017-06-26 Thread Daniel Vetter
On Thu, Jun 22, 2017 at 08:06:25AM +0200, Peter Rosin wrote:
> drm_fb_helper_save_lut_atomic is redundant since the .gamma_store is
> now always kept up to date by drm_fb_helper_setcmap.
> 
> Signed-off-by: Peter Rosin 

Also note that this is for kgdb support only and so likely very buggy
(since no one cried when we started to break kgdb support when switching
drivers to atomic).
-Daniel

> ---
>  drivers/gpu/drm/drm_fb_helper.c | 17 -
>  1 file changed, 17 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
> index 25315fb..7ade384 100644
> --- a/drivers/gpu/drm/drm_fb_helper.c
> +++ b/drivers/gpu/drm/drm_fb_helper.c
> @@ -229,22 +229,6 @@ int drm_fb_helper_remove_one_connector(struct 
> drm_fb_helper *fb_helper,
>  }
>  EXPORT_SYMBOL(drm_fb_helper_remove_one_connector);
>  
> -static void drm_fb_helper_save_lut_atomic(struct drm_crtc *crtc, struct 
> drm_fb_helper *helper)
> -{
> - uint16_t *r_base, *g_base, *b_base;
> - int i;
> -
> - if (helper->funcs->gamma_get == NULL)
> - return;
> -
> - r_base = crtc->gamma_store;
> - g_base = r_base + crtc->gamma_size;
> - b_base = g_base + crtc->gamma_size;
> -
> - for (i = 0; i < crtc->gamma_size; i++)
> - helper->funcs->gamma_get(crtc, &r_base[i], &g_base[i], 
> - &b_base[i], i);
> -}
> -
>  static void drm_fb_helper_restore_lut_atomic(struct drm_crtc *crtc)
>  {
>   uint16_t *r_base, *g_base, *b_base;
> @@ -285,7 +269,6 @@ int drm_fb_helper_debug_enter(struct fb_info *info)
>   if (drm_drv_uses_atomic_modeset(mode_set->crtc->dev))
>   continue;
>  
> - drm_fb_helper_save_lut_atomic(mode_set->crtc, helper);
>   funcs->mode_set_base_atomic(mode_set->crtc,
>   mode_set->fb,
>   mode_set->x,
> -- 
> 2.1.4
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 1/3] drm/amdgpu: fix a typo

2017-06-26 Thread zhoucm1



On 2017-06-26 03:48, Dave Airlie wrote:

Do you not use bo list at all in mesa? radv as well?

Currently radv creates a bo list per command submission. radv does
not use an offload thread for command submission, as it seems pretty
un-vulkan to put another thread behind the queue submission thread the
game uses.

I have considered investigating this, but with vulkan it would probably
be optimising for the single-threaded case, which isn't where apps
should really be.

At the moment we regenerate the bo list on every CS ioctl; we can
probably do a lot better. I know Bas has looked into this area a bit
more than I have.
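
For reference, a per-submission BO list through libdrm boils down to
roughly this (a sketch, not radv's actual code):

    /* Sketch: three ioctls per submission -- create, submit, destroy.
     * Uses libdrm's amdgpu.h API. */
    static int submit_once(amdgpu_device_handle dev,
                           amdgpu_context_handle ctx,
                           amdgpu_bo_handle *bos, uint32_t num_bos,
                           struct amdgpu_cs_request *request)
    {
            amdgpu_bo_list_handle list;
            int r;

            /* build the BO list for just this submission */
            r = amdgpu_bo_list_create(dev, num_bos, bos, NULL, &list);
            if (r)
                    return r;

            /* the CS ioctl references the list */
            request->resources = list;
            r = amdgpu_cs_submit(ctx, 0, request, 1);

            /* and the list is thrown away again */
            amdgpu_bo_list_destroy(list);
            return r;
    }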

Thanks, Dave, for the input.

Could I ask more about radv? How does radv build the bo list for every cs
ioctl? Does it hook every operation, so that any BO involved gets added to
the bo list when the command submission is made?


Thanks,
David Zhou


Dave.




Re: [PATCH 1/3] drm/amdgpu: fix a typo

2017-06-26 Thread Michel Dänzer
On 23/06/17 07:49 PM, Marek Olšák wrote:
> On Fri, Jun 23, 2017 at 11:27 AM, Christian König
>  wrote:
>> On 23.06.2017 11:08, zhoucm1 wrote:
>>> On 2017-06-23 17:01, zhoucm1 wrote:
 On 2017-06-23 16:25, Christian König wrote:
> On 23.06.2017 09:09, zhoucm1 wrote:
>> On 2017-06-23 14:57, Christian König wrote:
>>>
>>> But giving the CS IOCTL an option for directly specifying the BOs
>>> instead of a BO list like Marek suggested would indeed save us some time
>>> here.
>>
>> Interesting, I always look for ways to improve our cs ioctl, since UMD
>> guys often complain that our command submission is slower than on Windows.
>> Then how would we directly specify the BOs instead of a BO list? A BO handle
>> array from the UMD? Could you guys describe it more clearly? Is it doable?
>
>
> Making the BO list part of the CS IOCTL wouldn't help at all for the
> close source UMDs. To be precise we actually came up with the BO list
> approach because of their requirement.
>
> The biggest bunch of work during CS is reserving all the buffers,
> validating them and checking their VM status.

 Totally agree. Every time I read the code there, I want to optimize it.

> It doesn't matter if the BOs come from the BO list or directly in the CS
> IOCTL.
>
> The key point is that CS overhead is pretty much irrelevant for the open
> source stack, since Mesa does command submission from a separate thread
> anyway.

 If it's irrelevant for the open stack, then how does the open source stack
 handle "the biggest bunch of work during CS: reserving all the buffers,
 validating them and checking their VM status"?
>>
>>
>> Command submission on the open stack is outsourced to a separate user space
>> thread. E.g. when an application triggers a flush, the IBs created so far are
>> just put on a queue and another thread pushes them down to the kernel.
>>
>> I mean reducing the overhead of the CS IOCTL is always nice, but you usually
>> won't see any fps increase as long as not all CPUs are completely bound to
>> some tasks.
>>
 If the open stack has a better way, I think the closed stack can follow
 it; I don't know the history.
>>>
>>> Do you not use bo list at all in mesa? radv as well?
>>
>>
>> I don't think so. Mesa just wants to send the list of used BOs down to the
>> kernel with every IOCTL.
> 
> The CS ioctl actually costs us some performance, but not as much as on
> closed source drivers.
> 
> MesaGL always executes all CS ioctls in a separate thread (in parallel
> with the UMD) except for the last IB that's submitted by SwapBuffers.

... or by an explicit glFinish or glFlush (at least when the current
draw buffer isn't a back buffer) call, right?


> For us, it's certainly useful to optimize the CS ioctl because of apps
> that submit only 1 IB per frame where multithreading has no effect or
> may even hurt performance.

Another possibility might be flushing earlier, e.g. when the GPU and/or
CS submission thread are idle. But optimizing the CS ioctl would still
help in that case.

Finding good heuristics that allow better utilization of the GPU / CS
submission thread and don't hurt performance in any scenario might be
tricky though.
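
(For illustration, the offload pattern discussed above boils down to
something like this; job_queue_pop() is a hypothetical helper, not
actual Mesa code:)

    struct cs_job {
            amdgpu_context_handle ctx;
            struct amdgpu_cs_request request;
    };

    /* hypothetical queue helper; a real implementation would use a
     * mutex + condition variable and block until work arrives */
    struct job_queue;
    extern struct cs_job *job_queue_pop(struct job_queue *q);

    static void *cs_thread(void *arg)
    {
            struct job_queue *q = arg;
            struct cs_job *job;

            /* a flush just queues work; NULL means shut down */
            while ((job = job_queue_pop(q)) != NULL) {
                    amdgpu_cs_submit(job->ctx, 0, &job->request, 1);
                    free(job);
            }
            return NULL;
    }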


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


[PATCH] drm/amdgpu: fix vulkan test performance drop and hang on VI

2017-06-26 Thread Rex Zhu
Caused by not programming dynamic_cu_mask_addr in the KIQ MQD.

v2: create struct vi_mqd_allocation in FB which will contain
1. PM4 MQD structure.
2. Write Pointer Poll Memory.
3. Read Pointer Report Memory
4. Dynamic CU Mask.
5. Dynamic RB Mask.
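
I.e. (sketch; field names as spelled in this patch, the authoritative
layout is the struct added to vi_structs.h):

    struct vi_mqd_allocation {
            struct vi_mqd mqd;              /* 1. PM4 MQD structure */
            uint32_t wptr_poll_mem;         /* 2. Write Pointer Poll Memory */
            uint32_t rptr_report_mem;       /* 3. Read Pointer Report Memory */
            uint32_t dyamic_cu_mask;        /* 4. Dynamic CU Mask */
            uint32_t dyamic_rb_mask;        /* 5. Dynamic RB Mask */
    };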

Change-Id: I22c840f1bf8d365f7df33a27d6b11e1aea8f2958
Signed-off-by: Rex Zhu 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c|  27 ++--
 drivers/gpu/drm/amd/include/vi_structs.h | 268 +++
 2 files changed, 285 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
index 1a75ab1..452cc5b 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
@@ -40,7 +40,6 @@
 
 #include "bif/bif_5_0_d.h"
 #include "bif/bif_5_0_sh_mask.h"
-
 #include "gca/gfx_8_0_d.h"
 #include "gca/gfx_8_0_enum.h"
 #include "gca/gfx_8_0_sh_mask.h"
@@ -2100,7 +2099,7 @@ static int gfx_v8_0_sw_init(void *handle)
return r;
 
/* create MQD for all compute queues as well as KIQ for SRIOV case */
-   r = amdgpu_gfx_compute_mqd_sw_init(adev, sizeof(struct vi_mqd));
+   r = amdgpu_gfx_compute_mqd_sw_init(adev, sizeof(struct 
vi_mqd_allocation));
if (r)
return r;
 
@@ -4715,9 +4714,6 @@ static int gfx_v8_0_mqd_init(struct amdgpu_ring *ring)
uint64_t hqd_gpu_addr, wb_gpu_addr, eop_base_addr;
uint32_t tmp;
 
-   /* init the mqd struct */
-   memset(mqd, 0, sizeof(struct vi_mqd));
-
mqd->header = 0xC0310800;
mqd->compute_pipelinestat_enable = 0x00000001;
mqd->compute_static_thread_mgmt_se0 = 0xffffffff;
@@ -4725,7 +4721,12 @@ static int gfx_v8_0_mqd_init(struct amdgpu_ring *ring)
mqd->compute_static_thread_mgmt_se2 = 0xffffffff;
mqd->compute_static_thread_mgmt_se3 = 0xffffffff;
mqd->compute_misc_reserved = 0x00000003;
-
+   if (!(adev->flags & AMD_IS_APU)) {
+   mqd->dynamic_cu_mask_addr_lo = lower_32_bits(ring->mqd_gpu_addr
+   + offsetof(struct vi_mqd_allocation, dyamic_cu_mask));
+   mqd->dynamic_cu_mask_addr_hi = upper_32_bits(ring->mqd_gpu_addr
+   + offsetof(struct vi_mqd_allocation, dyamic_cu_mask));
+   }
eop_base_addr = ring->eop_gpu_addr >> 8;
mqd->cp_hqd_eop_base_addr_lo = eop_base_addr;
mqd->cp_hqd_eop_base_addr_hi = upper_32_bits(eop_base_addr);
@@ -4900,7 +4901,7 @@ static int gfx_v8_0_kiq_init_queue(struct amdgpu_ring 
*ring)
if (adev->gfx.in_reset) { /* for GPU_RESET case */
/* reset MQD to a clean status */
if (adev->gfx.mec.mqd_backup[mqd_idx])
-   memcpy(mqd, adev->gfx.mec.mqd_backup[mqd_idx], 
sizeof(*mqd));
+   memcpy(mqd, adev->gfx.mec.mqd_backup[mqd_idx], 
sizeof(struct vi_mqd_allocation));
 
/* reset ring buffer */
ring->wptr = 0;
@@ -4916,6 +4917,9 @@ static int gfx_v8_0_kiq_init_queue(struct amdgpu_ring 
*ring)
vi_srbm_select(adev, 0, 0, 0, 0);
mutex_unlock(&adev->srbm_mutex);
} else {
+   memset((void *)mqd, 0, sizeof(struct vi_mqd_allocation));
+   ((struct vi_mqd_allocation *)mqd)->dyamic_cu_mask = 0xFFFFFFFF;
+   ((struct vi_mqd_allocation *)mqd)->dyamic_rb_mask = 0xFFFFFFFF;
mutex_lock(&adev->srbm_mutex);
vi_srbm_select(adev, ring->me, ring->pipe, ring->queue, 0);
gfx_v8_0_mqd_init(ring);
@@ -4929,7 +4933,7 @@ static int gfx_v8_0_kiq_init_queue(struct amdgpu_ring 
*ring)
mutex_unlock(&adev->srbm_mutex);
 
if (adev->gfx.mec.mqd_backup[mqd_idx])
-   memcpy(adev->gfx.mec.mqd_backup[mqd_idx], mqd, 
sizeof(*mqd));
+   memcpy(adev->gfx.mec.mqd_backup[mqd_idx], mqd, 
sizeof(struct vi_mqd_allocation));
}
 
return r;
@@ -4947,6 +4951,9 @@ static int gfx_v8_0_kcq_init_queue(struct amdgpu_ring 
*ring)
int mqd_idx = ring - &adev->gfx.compute_ring[0];
 
if (!adev->gfx.in_reset && !adev->gfx.in_suspend) {
+   memset((void *)mqd, 0, sizeof(struct vi_mqd_allocation));
+   ((struct vi_mqd_allocation *)mqd)->dyamic_cu_mask = 0xFFFFFFFF;
+   ((struct vi_mqd_allocation *)mqd)->dyamic_rb_mask = 0xFFFFFFFF;
+   mutex_lock(&adev->srbm_mutex);
vi_srbm_select(adev, ring->me, ring->pipe, ring->queue, 0);
gfx_v8_0_mqd_init(ring);
@@ -4954,11 +4961,11 @@ static int gfx_v8_0_kcq_init_queue(struct amdgpu_ring 
*ring)
mutex_unlock(&adev->srbm_mutex);
 
if (adev->gfx.mec.mqd_backup[mqd_idx])
-   memcpy(adev->gfx.mec.mqd_backup[mqd_idx], mqd, 
sizeof(*mqd));
+   memcpy(adev->gfx.mec.mqd_backup[mqd_idx], mqd, 
sizeof(struct vi_mqd_allocation));
} 

Re: [PATCH] drm/amdgpu: fix solid screen and hard hang when 2k + 4k display connected with DP

2017-06-26 Thread Christian König

On 26.06.2017 09:03, Huang Rui wrote:

Signed-off-by: Huang Rui 
Cc: Xiaojie Yuan 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index 5bed483..60d0455 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -130,7 +130,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
ret = psp_cmd_submit(psp, ucode, cmd_buf_mc_addr,
 fence_mc_addr, index);
  
-	while (*((unsigned int *)psp->fence_buf) != index) {

+   while (*((unsigned int *)psp->fence_buf) < index) {


That is still horribly incorrect. At the bare minimum you need to use the 
READ_ONCE() macro here.
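
Something along these lines (untested):

    while (READ_ONCE(*(unsigned int *)psp->fence_buf) < index)
            msleep(1);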


But it would be even better to completely drop this and use the fence 
implementation in amdgpu_fence.c as well.


Christian.


msleep(1);
}
  





Re: [PATCH libdrm] amdgpu: update the exported always on CU bitmap

2017-06-26 Thread Christian König

On 26.06.2017 08:12, Flora Cui wrote:

keep cu_ao_mask unchanged for backward compatibility.

Change-Id: I9f497aadd309977468e246fea333b392c0150276
Signed-off-by: Flora Cui 
---
This patch should land after the kmd patch goes upstream, right?


In general yes, but this patch is a NAK as well.


  amdgpu/amdgpu.h  | 2 ++
  amdgpu/amdgpu_gpu_info.c | 1 +
  include/drm/amdgpu_drm.h | 3 +++
  3 files changed, 6 insertions(+)

diff --git a/amdgpu/amdgpu.h b/amdgpu/amdgpu.h
index b6779f9..cc80493 100644
--- a/amdgpu/amdgpu.h
+++ b/amdgpu/amdgpu.h
@@ -486,7 +486,9 @@ struct amdgpu_gpu_info {
uint32_t pa_sc_raster_cfg1[4];
/* CU info */
uint32_t cu_active_number;
+   /* NOTE: cu_ao_mask is INVALID, DON'T use it */
uint32_t cu_ao_mask;
+   uint32_t cu_ao_bitmap[4][4];
uint32_t cu_bitmap[4][4];


You can't change the amdgpu_gpu_info structure here for the same reasons 
you can't change the kernel interface.


Even worse, unlike the kernel interface, this structure isn't 
versioned, so you can't change it at all.


I suggest that we deprecate amdgpu_gpu_info and clients just call the 
appropriate info query directly.
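
E.g. something like this on the client side (sketch, using libdrm's 
amdgpu_query_info()):

    struct drm_amdgpu_info_device dev_info = {};
    int r = amdgpu_query_info(dev, AMDGPU_INFO_DEV_INFO,
                              sizeof(dev_info), &dev_info);
    if (r)
            return r;
    /* read dev_info.cu_ao_bitmap etc. directly */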


Regards,
Christian.


/* video memory type info*/
uint32_t vram_type;
diff --git a/amdgpu/amdgpu_gpu_info.c b/amdgpu/amdgpu_gpu_info.c
index 34f77be..7a5feb9 100644
--- a/amdgpu/amdgpu_gpu_info.c
+++ b/amdgpu/amdgpu_gpu_info.c
@@ -230,6 +230,7 @@ drm_private int 
amdgpu_query_gpu_info_init(amdgpu_device_handle dev)
  
  	dev->info.cu_active_number = dev->dev_info.cu_active_number;

dev->info.cu_ao_mask = dev->dev_info.cu_ao_mask;
+   memcpy(&dev->info.cu_ao_bitmap[0][0], &dev->dev_info.cu_ao_bitmap[0][0], 
sizeof(dev->info.cu_ao_bitmap));
memcpy(&dev->info.cu_bitmap[0][0], &dev->dev_info.cu_bitmap[0][0], 
sizeof(dev->info.cu_bitmap));
  
  	/* TODO: info->max_quad_shader_pipes is not set */

diff --git a/include/drm/amdgpu_drm.h b/include/drm/amdgpu_drm.h
index df250de..05c4e72 100644
--- a/include/drm/amdgpu_drm.h
+++ b/include/drm/amdgpu_drm.h
@@ -832,6 +832,7 @@ struct drm_amdgpu_info_device {
__u64 max_memory_clock;
/* cu information */
__u32 cu_active_number;
+   /* NOTE: cu_ao_mask is INVALID, DON'T use it */
__u32 cu_ao_mask;
__u32 cu_bitmap[4][4];
/** Render backend pipe mask. One render backend is CB+DB. */
@@ -886,6 +887,8 @@ struct drm_amdgpu_info_device {
/* max gs wavefront per vgt*/
__u32 max_gs_waves_per_vgt;
__u32 _pad1;
+   /* always on cu bitmap */
+   __u32 cu_ao_bitmap[4][4];
  };
  
  struct drm_amdgpu_info_hw_ip {





[PATCH] drm/amdgpu: fix solid screen and hard hang when 2k + 4k display connected with DP

2017-06-26 Thread Huang Rui
Signed-off-by: Huang Rui 
Cc: Xiaojie Yuan 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index 5bed483..60d0455 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -130,7 +130,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
ret = psp_cmd_submit(psp, ucode, cmd_buf_mc_addr,
 fence_mc_addr, index);
 
-   while (*((unsigned int *)psp->fence_buf) != index) {
+   while (*((unsigned int *)psp->fence_buf) < index) {
msleep(1);
}
 
-- 
2.7.4



Re: [PATCH libdrm] amdgpu: update the exported always on CU bitmap

2017-06-26 Thread Michel Dänzer
On 26/06/17 03:12 PM, Flora Cui wrote:
> keep cu_ao_mask unchanged for backward compatibility.
> 
> Change-Id: I9f497aadd309977468e246fea333b392c0150276
> Signed-off-by: Flora Cui 
> ---
> This patch should land after the kmd patch goes upstream, right?

Right. Also, include/drm/amdgpu_drm.h should be updated in a separate
patch, see  include/drm/README (in particular the "When and how to
update these files").


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


[PATCH libdrm] amdgpu: update the exported always on CU bitmap

2017-06-26 Thread Flora Cui
keep cu_ao_mask unchanged for backward compatibility.

Change-Id: I9f497aadd309977468e246fea333b392c0150276
Signed-off-by: Flora Cui 
---
This patch should land after the kmd patch goes upstream, right?
 amdgpu/amdgpu.h  | 2 ++
 amdgpu/amdgpu_gpu_info.c | 1 +
 include/drm/amdgpu_drm.h | 3 +++
 3 files changed, 6 insertions(+)

diff --git a/amdgpu/amdgpu.h b/amdgpu/amdgpu.h
index b6779f9..cc80493 100644
--- a/amdgpu/amdgpu.h
+++ b/amdgpu/amdgpu.h
@@ -486,7 +486,9 @@ struct amdgpu_gpu_info {
uint32_t pa_sc_raster_cfg1[4];
/* CU info */
uint32_t cu_active_number;
+   /* NOTE: cu_ao_mask is INVALID, DON'T use it */
uint32_t cu_ao_mask;
+   uint32_t cu_ao_bitmap[4][4];
uint32_t cu_bitmap[4][4];
/* video memory type info*/
uint32_t vram_type;
diff --git a/amdgpu/amdgpu_gpu_info.c b/amdgpu/amdgpu_gpu_info.c
index 34f77be..7a5feb9 100644
--- a/amdgpu/amdgpu_gpu_info.c
+++ b/amdgpu/amdgpu_gpu_info.c
@@ -230,6 +230,7 @@ drm_private int 
amdgpu_query_gpu_info_init(amdgpu_device_handle dev)
 
dev->info.cu_active_number = dev->dev_info.cu_active_number;
dev->info.cu_ao_mask = dev->dev_info.cu_ao_mask;
+   memcpy(&dev->info.cu_ao_bitmap[0][0], 
&dev->dev_info.cu_ao_bitmap[0][0], sizeof(dev->info.cu_ao_bitmap));
memcpy(&dev->info.cu_bitmap[0][0], &dev->dev_info.cu_bitmap[0][0], 
sizeof(dev->info.cu_bitmap));
 
/* TODO: info->max_quad_shader_pipes is not set */
diff --git a/include/drm/amdgpu_drm.h b/include/drm/amdgpu_drm.h
index df250de..05c4e72 100644
--- a/include/drm/amdgpu_drm.h
+++ b/include/drm/amdgpu_drm.h
@@ -832,6 +832,7 @@ struct drm_amdgpu_info_device {
__u64 max_memory_clock;
/* cu information */
__u32 cu_active_number;
+   /* NOTE: cu_ao_mask is INVALID, DON'T use it */
__u32 cu_ao_mask;
__u32 cu_bitmap[4][4];
/** Render backend pipe mask. One render backend is CB+DB. */
@@ -886,6 +887,8 @@ struct drm_amdgpu_info_device {
/* max gs wavefront per vgt*/
__u32 max_gs_waves_per_vgt;
__u32 _pad1;
+   /* always on cu bitmap */
+   __u32 cu_ao_bitmap[4][4];
 };
 
 struct drm_amdgpu_info_hw_ip {
-- 
2.7.4
