Re: [PATCH] drm/amdgpu: Attach exclusive fence to prime exported bo's. (v3)

2016-11-06 Thread Christian König

On 07.11.2016 at 04:29, Michel Dänzer wrote:

On 07/11/16 11:47 AM, Mario Kleiner wrote:

External clients which import our bo's wait only
on exclusive dmabuf-fences, not on shared ones,
so attach fences on exported buffers as exclusive
ones, if the buffers get imported by an external
client.

See discussion in thread:
https://lists.freedesktop.org/archives/dri-devel/2016-October/122370.html

Tested on Intel iGPU + AMD Tonga dGPU as DRI3/Present
Prime render offload, and with the Tonga standalone as
primary gpu.

v2: Add a wait for all shared fences before prime export,
 as suggested by Christian Koenig.

v3: - Mark buffer prime_exported in amdgpu_gem_prime_pin,
 so we only use the exclusive fence when exporting a
 bo to external clients like a separate iGPU, but not
 when exporting/importing from/to ourselves as part of
 regular DRI3 fd passing.

 - Propagate failure of reservation_object_wait_rcu back
 to caller.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95472
(v1) Tested-by: Mike Lothian 

FWIW, I think the (v1) is usually added at the end of the line, not at
the beginning.

[...]


diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
index 7700dc2..c4494d0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
@@ -81,6 +81,20 @@ int amdgpu_gem_prime_pin(struct drm_gem_object *obj)
  {
struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj);
int ret = 0;
+   long lret;
+
+   /*
+* Wait for all shared fences to complete before we switch to future
+* use of exclusive fence on this prime_exported bo.
+*/
+   lret = reservation_object_wait_timeout_rcu(bo->tbo.resv, true, false,
+  MAX_SCHEDULE_TIMEOUT);
+   if (unlikely(lret < 0)) {
+   DRM_DEBUG_PRIME("Fence wait failed: %li\n", lret);
+   return lret;
+   }
+
+   bo->prime_exported = true;

We should probably clear bo->prime_exported in amdgpu_gem_prime_unpin.

Also, I think we should set bo->prime_exported (prime_shared?) in
amdgpu_gem_prime_import_sg_table as well, so that we'll also set
exclusive fences on BOs imported from other GPUs.

Yes, really good idea.

In addition to that, are we sure that amdgpu_gem_prime_pin() is only 
called once for each GEM object? It could in theory be that somebody 
exports a BO to another GFX device as well as a V4L device, for example.


In this case we would need a counter, but I need to double check that as 
well.


In general I would protect the flag/counter by the BO being reserved, 
to avoid any races. In other words, move those lines a bit down, after 
the amdgpu_bo_reserve().
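
Something like this rough sketch is what I have in mind (prime_shared_count
is a hypothetical name for such a counter; this is not the patch as posted):

int amdgpu_gem_prime_pin(struct drm_gem_object *obj)
{
	struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj);
	long lret;
	int ret;

	ret = amdgpu_bo_reserve(bo, false);
	if (unlikely(ret != 0))
		return ret;

	/* Flush all shared fences before switching this BO over to
	 * exclusive fencing; doing it under the reservation avoids
	 * racing with concurrent submissions. */
	lret = reservation_object_wait_timeout_rcu(bo->tbo.resv, true, false,
						   MAX_SCHEDULE_TIMEOUT);
	if (unlikely(lret < 0)) {
		ret = lret;
		goto out_unreserve;
	}

	/* Count exports so that pinning for a second device (e.g. V4L
	 * next to another GFX device) stays balanced; the matching
	 * decrement would live in amdgpu_gem_prime_unpin(). */
	bo->prime_shared_count++;

	ret = amdgpu_bo_pin(bo, AMDGPU_GEM_DOMAIN_GTT, NULL);
out_unreserve:
	amdgpu_bo_unreserve(bo);
	return ret;
}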


Regards,
Christian.
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


amdgpu: ratelimit on vm faults to avoid spamming console

2016-11-06 Thread Edward O'Callaghan
These are rather minor; however, they should help stop some folks'
machines grinding to a halt when a userspace application somehow
gets the GPU into some horrible state, causing the console to fill
very quickly. Applies on top of master.

Please kindly review,

Edward O'Callaghan (2):
 [PATCH v2 1/2] amdgpu: Use dev_err() over vanilla printk() in
 [PATCH v2 2/2] amdgpu: Wrap dev_err() calls on vm faults with
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH v2 1/2] amdgpu: Use dev_err() over vanilla printk() in vm_decode_fault()

2016-11-06 Thread Edward O'Callaghan
Signed-off-by: Edward O'Callaghan 
---
 drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c | 2 +-
 drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
index aa0c4b9..a0df431 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
@@ -711,7 +711,7 @@ static void gmc_v7_0_vm_decode_fault(struct amdgpu_device *adev,
	mc_id = REG_GET_FIELD(status, VM_CONTEXT1_PROTECTION_FAULT_STATUS,
			      MEMORY_CLIENT_ID);
 
-   printk("VM fault (0x%02x, vmid %d) at page %u, %s from '%s' (0x%08x) (%d)\n",
+   dev_err(adev->dev, "VM fault (0x%02x, vmid %d) at page %u, %s from '%s' (0x%08x) (%d)\n",
   protections, vmid, addr,
   REG_GET_FIELD(status, VM_CONTEXT1_PROTECTION_FAULT_STATUS,
 MEMORY_CLIENT_RW) ?
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
index a16b220..7285294 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
@@ -837,7 +837,7 @@ static void gmc_v8_0_vm_decode_fault(struct amdgpu_device *adev,
	mc_id = REG_GET_FIELD(status, VM_CONTEXT1_PROTECTION_FAULT_STATUS,
			      MEMORY_CLIENT_ID);
 
-   printk("VM fault (0x%02x, vmid %d) at page %u, %s from '%s' (0x%08x) (%d)\n",
+   dev_err(adev->dev, "VM fault (0x%02x, vmid %d) at page %u, %s from '%s' (0x%08x) (%d)\n",
   protections, vmid, addr,
   REG_GET_FIELD(status, VM_CONTEXT1_PROTECTION_FAULT_STATUS,
 MEMORY_CLIENT_RW) ?
-- 
2.7.4
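
For reference, the practical benefit of dev_err() over a bare printk() here
is that the driver core prefixes each message with the driver and device
name, so on a multi-GPU system the fault line comes out attributable,
roughly like this (illustrative values only, not real output):

amdgpu 0000:01:00.0: VM fault (0x0c, vmid 3) at page 1026703, read from 'TC0' (0x0301f403) (72)

whereas a plain printk() gives no hint which device actually faulted.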

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH v2 2/2] amdgpu: Wrap dev_err() calls on vm faults with printk_ratelimit()

2016-11-06 Thread Edward O'Callaghan
It can be the case that upon GPU page faults we start trashing
the logs, and so let us ratelimit here to avoid that.

V2. Fix issue where calling dev_err_ratelimited separately for
each line means that some lines corresponding to a single
VM fault may or may not appear depending on the rate.
- Michel Dänzer.

Signed-off-by: Edward O'Callaghan 
---
 drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c | 16 +---
 drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c | 16 +---
 drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c | 16 +---
 3 files changed, 27 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
index b13c8aa..1721c4a 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
@@ -1007,13 +1007,15 @@ static int gmc_v6_0_process_interrupt(struct amdgpu_device *adev,
	if (amdgpu_vm_fault_stop == AMDGPU_VM_FAULT_STOP_FIRST)
		gmc_v6_0_set_fault_enable_default(adev, false);
 
-   dev_err(adev->dev, "GPU fault detected: %d 0x%08x\n",
-   entry->src_id, entry->src_data);
-   dev_err(adev->dev, "  VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x%08X\n",
-   addr);
-   dev_err(adev->dev, "  VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x%08X\n",
-   status);
-   gmc_v6_0_vm_decode_fault(adev, status, addr, 0);
+   if (printk_ratelimit()) {
+   dev_err(adev->dev, "GPU fault detected: %d 0x%08x\n",
+   entry->src_id, entry->src_data);
+   dev_err(adev->dev, "  VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x%08X\n",
+   addr);
+   dev_err(adev->dev, "  VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x%08X\n",
+   status);
+   gmc_v6_0_vm_decode_fault(adev, status, addr, 0);
+   }
 
return 0;
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
index a0df431..940857e 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
@@ -1198,13 +1198,15 @@ static int gmc_v7_0_process_interrupt(struct amdgpu_device *adev,
	if (amdgpu_vm_fault_stop == AMDGPU_VM_FAULT_STOP_FIRST)
		gmc_v7_0_set_fault_enable_default(adev, false);
 
-   dev_err(adev->dev, "GPU fault detected: %d 0x%08x\n",
-   entry->src_id, entry->src_data);
-   dev_err(adev->dev, "  VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x%08X\n",
-   addr);
-   dev_err(adev->dev, "  VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x%08X\n",
-   status);
-   gmc_v7_0_vm_decode_fault(adev, status, addr, mc_client);
+   if (printk_ratelimit()) {
+   dev_err(adev->dev, "GPU fault detected: %d 0x%08x\n",
+   entry->src_id, entry->src_data);
+   dev_err(adev->dev, "  VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x%08X\n",
+   addr);
+   dev_err(adev->dev, "  VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x%08X\n",
+   status);
+   gmc_v7_0_vm_decode_fault(adev, status, addr, mc_client);
+   }
 
return 0;
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
index 7285294..77690f8 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
@@ -1242,13 +1242,15 @@ static int gmc_v8_0_process_interrupt(struct amdgpu_device *adev,
	if (amdgpu_vm_fault_stop == AMDGPU_VM_FAULT_STOP_FIRST)
		gmc_v8_0_set_fault_enable_default(adev, false);
 
-   dev_err(adev->dev, "GPU fault detected: %d 0x%08x\n",
-   entry->src_id, entry->src_data);
-   dev_err(adev->dev, "  VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x%08X\n",
-   addr);
-   dev_err(adev->dev, "  VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x%08X\n",
-   status);
-   gmc_v8_0_vm_decode_fault(adev, status, addr, mc_client);
+   if (printk_ratelimit()) {
+   dev_err(adev->dev, "GPU fault detected: %d 0x%08x\n",
+   entry->src_id, entry->src_data);
+   dev_err(adev->dev, "  VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x%08X\n",
+   addr);
+   dev_err(adev->dev, "  VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x%08X\n",
+   status);
+   gmc_v8_0_vm_decode_fault(adev, status, addr, mc_client);
+   }
 
return 0;
 }
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 2/2] amdgpu: Use 'dev_err_ratelimited()' over 'dev_err()' on vm faults

2016-11-06 Thread Edward O'Callaghan


On 11/07/2016 02:55 PM, Michel Dänzer wrote:
> On 07/11/16 12:47 PM, Edward O'Callaghan wrote:
>> It can be the case that upon GPU page faults we start trashing
>> the logs, and so let us ratelimit here to avoid that.
>>
>> Signed-off-by: Edward O'Callaghan 
>> ---
>>  drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c | 8 
>>  drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c | 8 
>>  drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c | 8 
>>  3 files changed, 12 insertions(+), 12 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
>> index b13c8aa..79af3f2 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
>> @@ -549,7 +549,7 @@ static void gmc_v6_0_vm_decode_fault(struct amdgpu_device *adev,
>>  mc_id = REG_GET_FIELD(status, mmVM_CONTEXT1_PROTECTION_FAULT_STATUS,
>>xxMEMORY_CLIENT_ID);
>>  
>> -dev_err(adev->dev, "VM fault (0x%02x, vmid %d) at page %u, %s from '%s' (0x%08x) (%d)\n",
>> +dev_err_ratelimited(adev->dev, "VM fault (0x%02x, vmid %d) at page %u, %s from '%s' (0x%08x) (%d)\n",
>> protections, vmid, addr,
>> REG_GET_FIELD(status, mmVM_CONTEXT1_PROTECTION_FAULT_STATUS,
>>   xxMEMORY_CLIENT_RW) ?
>> @@ -1007,11 +1007,11 @@ static int gmc_v6_0_process_interrupt(struct amdgpu_device *adev,
>>  if (amdgpu_vm_fault_stop == AMDGPU_VM_FAULT_STOP_FIRST)
>>  gmc_v6_0_set_fault_enable_default(adev, false);
>>  
>> -dev_err(adev->dev, "GPU fault detected: %d 0x%08x\n",
>> +dev_err_ratelimited(adev->dev, "GPU fault detected: %d 0x%08x\n",
>>  entry->src_id, entry->src_data);
>> -dev_err(adev->dev, "  VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x%08X\n",
>> +dev_err_ratelimited(adev->dev, "  VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x%08X\n",
>>  addr);
>> -dev_err(adev->dev, "  VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x%08X\n",
>> +dev_err_ratelimited(adev->dev, "  VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x%08X\n",
>>  status);
>>  gmc_v6_0_vm_decode_fault(adev, status, addr, 0);
> 
> If calling dev_err_ratelimited separately for each line means that some
> lines corresponding to a single VM fault may or may not appear depending
> on the rate, it'd be better to use a single call per VM fault.
Ah ok, yes, I was not 100% sure that was definitely the case - easy to fix, I
will resend an update to address it. Thanks, Ed.




___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 2/2] amdgpu: Use 'dev_err_ratelimited()' over 'dev_err()' on vm faults

2016-11-06 Thread Michel Dänzer
On 07/11/16 12:47 PM, Edward O'Callaghan wrote:
> It can be the case that upon GPU page faults we start trashing
> the logs, and so let us ratelimit here to avoid that.
> 
> Signed-off-by: Edward O'Callaghan 
> ---
>  drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c | 8 
>  drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c | 8 
>  drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c | 8 
>  3 files changed, 12 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
> index b13c8aa..79af3f2 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
> @@ -549,7 +549,7 @@ static void gmc_v6_0_vm_decode_fault(struct amdgpu_device *adev,
>   mc_id = REG_GET_FIELD(status, mmVM_CONTEXT1_PROTECTION_FAULT_STATUS,
> xxMEMORY_CLIENT_ID);
>  
> - dev_err(adev->dev, "VM fault (0x%02x, vmid %d) at page %u, %s from '%s' (0x%08x) (%d)\n",
> + dev_err_ratelimited(adev->dev, "VM fault (0x%02x, vmid %d) at page %u, %s from '%s' (0x%08x) (%d)\n",
>  protections, vmid, addr,
>  REG_GET_FIELD(status, mmVM_CONTEXT1_PROTECTION_FAULT_STATUS,
>xxMEMORY_CLIENT_RW) ?
> @@ -1007,11 +1007,11 @@ static int gmc_v6_0_process_interrupt(struct amdgpu_device *adev,
>   if (amdgpu_vm_fault_stop == AMDGPU_VM_FAULT_STOP_FIRST)
>   gmc_v6_0_set_fault_enable_default(adev, false);
>  
> - dev_err(adev->dev, "GPU fault detected: %d 0x%08x\n",
> + dev_err_ratelimited(adev->dev, "GPU fault detected: %d 0x%08x\n",
>   entry->src_id, entry->src_data);
> - dev_err(adev->dev, "  VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x%08X\n",
> + dev_err_ratelimited(adev->dev, "  VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x%08X\n",
>   addr);
> - dev_err(adev->dev, "  VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x%08X\n",
> + dev_err_ratelimited(adev->dev, "  VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x%08X\n",
>   status);
>   gmc_v6_0_vm_decode_fault(adev, status, addr, 0);

If calling dev_err_ratelimited separately for each line means that some
lines corresponding to a single VM fault may or may not appear depending
on the rate, it'd be better to use a single call per VM fault.
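
One way to get that grouping without relying on the global printk_ratelimit()
state would be a private ratelimit guard around the whole report; a sketch
only (amdgpu_vm_fault_rs is a made-up name), not a tested patch:

#include <linux/ratelimit.h>

static DEFINE_RATELIMIT_STATE(amdgpu_vm_fault_rs,
			      DEFAULT_RATELIMIT_INTERVAL,
			      DEFAULT_RATELIMIT_BURST);

	/* All four lines of one fault are then printed together or
	 * suppressed together, independent of other printk users. */
	if (__ratelimit(&amdgpu_vm_fault_rs)) {
		dev_err(adev->dev, "GPU fault detected: %d 0x%08x\n",
			entry->src_id, entry->src_data);
		dev_err(adev->dev, "  VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x%08X\n",
			addr);
		dev_err(adev->dev, "  VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x%08X\n",
			status);
		gmc_v6_0_vm_decode_fault(adev, status, addr, 0);
	}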


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH 1/2] amdgpu: Use dev_err() over vanilla printk() in vm_decode_fault()

2016-11-06 Thread Edward O'Callaghan
Signed-off-by: Edward O'Callaghan 
---
 drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c | 2 +-
 drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
index aa0c4b9..a0df431 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
@@ -711,7 +711,7 @@ static void gmc_v7_0_vm_decode_fault(struct amdgpu_device *adev,
	mc_id = REG_GET_FIELD(status, VM_CONTEXT1_PROTECTION_FAULT_STATUS,
			      MEMORY_CLIENT_ID);
 
-   printk("VM fault (0x%02x, vmid %d) at page %u, %s from '%s' (0x%08x) (%d)\n",
+   dev_err(adev->dev, "VM fault (0x%02x, vmid %d) at page %u, %s from '%s' (0x%08x) (%d)\n",
   protections, vmid, addr,
   REG_GET_FIELD(status, VM_CONTEXT1_PROTECTION_FAULT_STATUS,
 MEMORY_CLIENT_RW) ?
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
index a16b220..7285294 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
@@ -837,7 +837,7 @@ static void gmc_v8_0_vm_decode_fault(struct amdgpu_device *adev,
	mc_id = REG_GET_FIELD(status, VM_CONTEXT1_PROTECTION_FAULT_STATUS,
			      MEMORY_CLIENT_ID);
 
-   printk("VM fault (0x%02x, vmid %d) at page %u, %s from '%s' (0x%08x) (%d)\n",
+   dev_err(adev->dev, "VM fault (0x%02x, vmid %d) at page %u, %s from '%s' (0x%08x) (%d)\n",
   protections, vmid, addr,
   REG_GET_FIELD(status, VM_CONTEXT1_PROTECTION_FAULT_STATUS,
 MEMORY_CLIENT_RW) ?
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH 2/2] amdgpu: Use 'dev_err_ratelimited()' over 'dev_err()' on vm faults

2016-11-06 Thread Edward O'Callaghan
It can be the case that upon GPU page faults we start trashing
the logs, and so let us ratelimit here to avoid that.

Signed-off-by: Edward O'Callaghan 
---
 drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c | 8 
 drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c | 8 
 drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c | 8 
 3 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
index b13c8aa..79af3f2 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
@@ -549,7 +549,7 @@ static void gmc_v6_0_vm_decode_fault(struct amdgpu_device *adev,
	mc_id = REG_GET_FIELD(status, mmVM_CONTEXT1_PROTECTION_FAULT_STATUS,
			      xxMEMORY_CLIENT_ID);
 
-   dev_err(adev->dev, "VM fault (0x%02x, vmid %d) at page %u, %s from '%s' (0x%08x) (%d)\n",
+   dev_err_ratelimited(adev->dev, "VM fault (0x%02x, vmid %d) at page %u, %s from '%s' (0x%08x) (%d)\n",
   protections, vmid, addr,
   REG_GET_FIELD(status, mmVM_CONTEXT1_PROTECTION_FAULT_STATUS,
 xxMEMORY_CLIENT_RW) ?
@@ -1007,11 +1007,11 @@ static int gmc_v6_0_process_interrupt(struct amdgpu_device *adev,
	if (amdgpu_vm_fault_stop == AMDGPU_VM_FAULT_STOP_FIRST)
		gmc_v6_0_set_fault_enable_default(adev, false);
 
-   dev_err(adev->dev, "GPU fault detected: %d 0x%08x\n",
+   dev_err_ratelimited(adev->dev, "GPU fault detected: %d 0x%08x\n",
entry->src_id, entry->src_data);
-   dev_err(adev->dev, "  VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x%08X\n",
+   dev_err_ratelimited(adev->dev, "  VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x%08X\n",
addr);
-   dev_err(adev->dev, "  VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x%08X\n",
+   dev_err_ratelimited(adev->dev, "  VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x%08X\n",
status);
gmc_v6_0_vm_decode_fault(adev, status, addr, 0);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
index a0df431..4ed97c4 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
@@ -711,7 +711,7 @@ static void gmc_v7_0_vm_decode_fault(struct amdgpu_device *adev,
	mc_id = REG_GET_FIELD(status, VM_CONTEXT1_PROTECTION_FAULT_STATUS,
			      MEMORY_CLIENT_ID);
 
-   dev_err(adev->dev, "VM fault (0x%02x, vmid %d) at page %u, %s from '%s' (0x%08x) (%d)\n",
+   dev_err_ratelimited(adev->dev, "VM fault (0x%02x, vmid %d) at page %u, %s from '%s' (0x%08x) (%d)\n",
   protections, vmid, addr,
   REG_GET_FIELD(status, VM_CONTEXT1_PROTECTION_FAULT_STATUS,
 MEMORY_CLIENT_RW) ?
@@ -1198,11 +1198,11 @@ static int gmc_v7_0_process_interrupt(struct amdgpu_device *adev,
	if (amdgpu_vm_fault_stop == AMDGPU_VM_FAULT_STOP_FIRST)
		gmc_v7_0_set_fault_enable_default(adev, false);
 
-   dev_err(adev->dev, "GPU fault detected: %d 0x%08x\n",
+   dev_err_ratelimited(adev->dev, "GPU fault detected: %d 0x%08x\n",
entry->src_id, entry->src_data);
-   dev_err(adev->dev, "  VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x%08X\n",
+   dev_err_ratelimited(adev->dev, "  VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x%08X\n",
addr);
-   dev_err(adev->dev, "  VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x%08X\n",
+   dev_err_ratelimited(adev->dev, "  VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x%08X\n",
status);
gmc_v7_0_vm_decode_fault(adev, status, addr, mc_client);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
index 7285294..050a884 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
@@ -837,7 +837,7 @@ static void gmc_v8_0_vm_decode_fault(struct amdgpu_device *adev,
	mc_id = REG_GET_FIELD(status, VM_CONTEXT1_PROTECTION_FAULT_STATUS,
			      MEMORY_CLIENT_ID);
 
-   dev_err(adev->dev, "VM fault (0x%02x, vmid %d) at page %u, %s from '%s' (0x%08x) (%d)\n",
+   dev_err_ratelimited(adev->dev, "VM fault (0x%02x, vmid %d) at page %u, %s from '%s' (0x%08x) (%d)\n",
   protections, vmid, addr,
   REG_GET_FIELD(status, VM_CONTEXT1_PROTECTION_FAULT_STATUS,
 MEMORY_CLIENT_RW) ?
@@ -1242,11 +1242,11 @@ static int gmc_v8_0_process_interrupt(struct amdgpu_device *adev,
if (amdgpu_vm_fault_stop == AMDGPU_VM_FAULT_STOP_FIRST)
gmc_v8_0_set_fault_enable_default(adev, false);
 
-   dev_err(adev->dev, "GPU fault detected: %d 0x%08x\n",
+   dev_err_ratelimited(adev->dev, "GPU fault detected: %d 0x%08x\n",
entry->src_id, entry->src_data);
-   dev_err(adev->dev, "  VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x%08X\n",
+   dev_err_ratelimited(adev->dev, "  VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x%08X\n",
addr);
-   dev_err(adev->dev, "  VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x%08X\n",
+   dev_err_ratelimited(adev->dev, "  VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x%08X\n",
status);
gmc_v8_0_vm_decode_fault(adev, status, addr, mc_client);
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

amdgpu: ratelimit on vm faults to avoid spamming console

2016-11-06 Thread Edward O'Callaghan
These are rather minor; however, they should help stop some folks'
machines grinding to a halt when a userspace application somehow
gets the GPU into some horrible state, causing the console to fill
very quickly. Applies on top of master.

Please kindly review,

Edward O'Callaghan (2):
 [PATCH 1/2] amdgpu: Use dev_err() over vanilla printk() in
 [PATCH 2/2] amdgpu: Use 'dev_err_ratelimited()' over 'dev_err()' on
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: Attach exclusive fence to prime exported bo's. (v3)

2016-11-06 Thread Michel Dänzer
On 07/11/16 11:47 AM, Mario Kleiner wrote:
> External clients which import our bo's wait only
> on exclusive dmabuf-fences, not on shared ones,
> so attach fences on exported buffers as exclusive
> ones, if the buffers get imported by an external
> client.
> 
> See discussion in thread:
> https://lists.freedesktop.org/archives/dri-devel/2016-October/122370.html
> 
> Tested on Intel iGPU + AMD Tonga dGPU as DRI3/Present
> Prime render offload, and with the Tonga standalone as
> primary gpu.
> 
> v2: Add a wait for all shared fences before prime export,
> as suggested by Christian Koenig.
> 
> v3: - Mark buffer prime_exported in amdgpu_gem_prime_pin,
> so we only use the exclusive fence when exporting a
> bo to external clients like a separate iGPU, but not
> when exporting/importing from/to ourselves as part of
> regular DRI3 fd passing.
> 
> - Propagate failure of reservation_object_wait_rcu back
> to caller.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95472
> (v1) Tested-by: Mike Lothian 

FWIW, I think the (v1) is usually added at the end of the line, not at
the beginning.

[...]

> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
> index 7700dc2..c4494d0 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
> @@ -81,6 +81,20 @@ int amdgpu_gem_prime_pin(struct drm_gem_object *obj)
>  {
>   struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj);
>   int ret = 0;
> + long lret;
> +
> + /*
> +  * Wait for all shared fences to complete before we switch to future
> +  * use of exclusive fence on this prime_exported bo.
> +  */
> + lret = reservation_object_wait_timeout_rcu(bo->tbo.resv, true, false,
> +MAX_SCHEDULE_TIMEOUT);
> + if (unlikely(lret < 0)) {
> + DRM_DEBUG_PRIME("Fence wait failed: %li\n", lret);
> + return lret;
> + }
> +
> + bo->prime_exported = true;

We should probably clear bo->prime_exported in amdgpu_gem_prime_unpin.

Also, I think we should set bo->prime_exported (prime_shared?) in
amdgpu_gem_prime_import_sg_table as well, so that we'll also set
exclusive fences on BOs imported from other GPUs.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 2/2] drm/amdgpu: add the interface of waiting multiple fences (v4)

2016-11-06 Thread Gustavo Padovan
Hi Alex,

2016-11-04 Alex Deucher :

> From: Junwei Zhang 
> 
> v2: agd: rebase and squash in all the previous optimizations and
> changes so everything compiles.
> v3: squash in Slava's 32bit build fix
> v4: rebase on drm-next (fence -> dma_fence),
> squash in Monk's ioctl update patch
> 
> Signed-off-by: Junwei Zhang 
> Reviewed-by: Monk Liu 
> Reviewed-by: Jammy Zhou 
> Signed-off-by: Alex Deucher 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu.h |   2 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  | 173
>  drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c |   1 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_sa.c  |   2 +-
>  include/uapi/drm/amdgpu_drm.h   |  28 ++
>  5 files changed, 205 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index dc98ceb..7a94a3c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -1212,6 +1212,8 @@ int amdgpu_gem_op_ioctl(struct drm_device *dev, void *data,
>   struct drm_file *filp);
>  int amdgpu_cs_ioctl(struct drm_device *dev, void *data, struct drm_file 
> *filp);
>  int amdgpu_cs_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *filp);
> +int amdgpu_cs_wait_fences_ioctl(struct drm_device *dev, void *data,
> + struct drm_file *filp);
>  
>  int amdgpu_gem_metadata_ioctl(struct drm_device *dev, void *data,
>   struct drm_file *filp);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index 2728805..2004836 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -1130,6 +1130,179 @@ int amdgpu_cs_wait_ioctl(struct drm_device *dev, void *data,
>  }
>  
>  /**
> + * amdgpu_cs_get_fence - helper to get fence from drm_amdgpu_fence
> + *
> + * @adev: amdgpu device
> + * @filp: file private
> + * @user: drm_amdgpu_fence copied from user space
> + */
> +static struct dma_fence *amdgpu_cs_get_fence(struct amdgpu_device *adev,
> +  struct drm_file *filp,
> +  struct drm_amdgpu_fence *user)
> +{
> + struct amdgpu_ring *ring;
> + struct amdgpu_ctx *ctx;
> + struct dma_fence *fence;
> + int r;
> +
> + r = amdgpu_cs_get_ring(adev, user->ip_type, user->ip_instance,
> +user->ring, &ring);
> + if (r)
> + return ERR_PTR(r);
> +
> + ctx = amdgpu_ctx_get(filp->driver_priv, user->ctx_id);
> + if (ctx == NULL)
> + return ERR_PTR(-EINVAL);
> +
> + fence = amdgpu_ctx_get_fence(ctx, ring, user->seq_no);
> + amdgpu_ctx_put(ctx);
> +
> + return fence;
> +}
> +
> +/**
> + * amdgpu_cs_wait_all_fence - wait on all fences to signal
> + *
> + * @adev: amdgpu device
> + * @filp: file private
> + * @wait: wait parameters
> + * @fences: array of drm_amdgpu_fence
> + */
> +static int amdgpu_cs_wait_all_fences(struct amdgpu_device *adev,
> +  struct drm_file *filp,
> +  union drm_amdgpu_wait_fences *wait,
> +  struct drm_amdgpu_fence *fences)
> +{
> + uint32_t fence_count = wait->in.fence_count;
> + unsigned i;
> + long r = 1;
> +
> + for (i = 0; i < fence_count; i++) {
> + struct dma_fence *fence;
> + unsigned long timeout = amdgpu_gem_timeout(wait->in.timeout_ns);
> +
> + fence = amdgpu_cs_get_fence(adev, filp, &fences[i]);
> + if (IS_ERR(fence))
> + return PTR_ERR(fence);
> + else if (!fence)
> + continue;
> +
> + r = dma_fence_wait_timeout(fence, true, timeout);
> + if (r < 0)
> + return r;
> +
> + if (r == 0)
> + break;
> + }
> +
> + memset(wait, 0, sizeof(*wait));
> + wait->out.status = (r > 0);
> +
> + return 0;
> +}
> +
> +/**
> + * amdgpu_cs_wait_any_fence - wait on any fence to signal
> + *
> + * @adev: amdgpu device
> + * @filp: file private
> + * @wait: wait parameters
> + * @fences: array of drm_amdgpu_fence
> + */
> +static int amdgpu_cs_wait_any_fence(struct amdgpu_device *adev,
> + struct drm_file *filp,
> + union drm_amdgpu_wait_fences *wait,
> + struct drm_amdgpu_fence *fences)
> +{
> + unsigned long timeout = amdgpu_gem_timeout(wait->in.timeout_ns);
> + uint32_t fence_count = wait->in.fence_count;
> + uint32_t first = ~0;
> + struct dma_fence **array;
> + unsigned i;
> + long r;
> +
> + /* Prepare the fence array */
> + array = (struct dma_fence **)kcalloc(fence_count, sizeof(struct dma_fence *),
> + GFP_KERNEL);
> +  

[PATCH] drm/amdgpu: Attach exclusive fence to prime exported bo's. (v3)

2016-11-06 Thread Mario Kleiner
External clients which import our bo's wait only
on exclusive dmabuf-fences, not on shared ones,
so attach fences on exported buffers as exclusive
ones, if the buffers get imported by an external
client.

See discussion in thread:
https://lists.freedesktop.org/archives/dri-devel/2016-October/122370.html

Tested on Intel iGPU + AMD Tonga dGPU as DRI3/Present
Prime render offload, and with the Tonga standalone as
primary gpu.

v2: Add a wait for all shared fences before prime export,
as suggested by Christian Koenig.

v3: - Mark buffer prime_exported in amdgpu_gem_prime_pin,
so we only use the exclusive fence when exporting a
bo to external clients like a separate iGPU, but not
when exporting/importing from/to ourselves as part of
regular DRI3 fd passing.

- Propagate failure of reservation_object_wait_rcu back
to caller.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95472
(v1) Tested-by: Mike Lothian 
Signed-off-by: Mario Kleiner 
Cc: Christian König 
Cc: Michel Dänzer 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c   | 14 ++
 3 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 039b57e..a337d56 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -459,6 +459,7 @@ struct amdgpu_bo {
u64 metadata_flags;
void*metadata;
u32 metadata_size;
+   boolprime_exported;
/* list of all virtual address to which this bo
 * is associated to
 */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
index 651115d..51c6f60 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
@@ -132,7 +132,7 @@ static int amdgpu_bo_list_set(struct amdgpu_device *adev,
entry->priority = min(info[i].bo_priority,
  AMDGPU_BO_LIST_MAX_PRIORITY);
entry->tv.bo = &entry->robj->tbo;
-   entry->tv.shared = true;
+   entry->tv.shared = !entry->robj->prime_exported;
 
if (entry->robj->prefered_domains == AMDGPU_GEM_DOMAIN_GDS)
gds_obj = entry->robj;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
index 7700dc2..c4494d0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
@@ -81,6 +81,20 @@ int amdgpu_gem_prime_pin(struct drm_gem_object *obj)
 {
struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj);
int ret = 0;
+   long lret;
+
+   /*
+* Wait for all shared fences to complete before we switch to future
+* use of exclusive fence on this prime_exported bo.
+*/
+   lret = reservation_object_wait_timeout_rcu(bo->tbo.resv, true, false,
+  MAX_SCHEDULE_TIMEOUT);
+   if (unlikely(lret < 0)) {
+   DRM_DEBUG_PRIME("Fence wait failed: %li\n", lret);
+   return lret;
+   }
+
+   bo->prime_exported = true;
 
ret = amdgpu_bo_reserve(bo, false);
if (unlikely(ret != 0))
-- 
2.7.4
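
For background on why the shared fences are invisible to importers: a typical
implicit-sync consumer waits on the reservation object with wait_all=false,
which considers only the exclusive fence. A minimal sketch against the
reservation_object API of this kernel (importer-side code, not part of the
patch above):

	long lret;

	/* wait_all=false ignores shared fences entirely, so work that is
	 * tracked only by shared fences is never waited on by the importer. */
	lret = reservation_object_wait_timeout_rcu(bo->tbo.resv,
						   false /* wait_all */,
						   true  /* intr */,
						   MAX_SCHEDULE_TIMEOUT);
	if (lret < 0)
		return lret;

That is why the patch promotes fences to exclusive once a BO has been
exported outside the driver.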

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 2/2] drm/amdgpu: remove extra placement for AMDGPU_GEM_CREATE_NO_CPU_ACCESS

2016-11-06 Thread zhoucm1

Hi Christian,
I haven't tracked the code for some time, so I'm not sure if I missed anything.
In my mind, this change was added while doing performance optimization.
If you don't encounter any problem, I suggest not changing it; we
have had many regressions (bugs and performance) recently.


Regards,
David Zhou

On 2016-11-04 19:00, Christian König wrote:

From: Christian König 

This only has the effect of scanning the invisible range twice
since the topdown flag is given anyway.

Signed-off-by: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 11 ---
  1 file changed, 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 6efa8d7..052c1b0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -128,17 +128,6 @@ static void amdgpu_ttm_placement_init(struct amdgpu_device *adev,
if (flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS)
lpfn = adev->mc.real_vram_size >> PAGE_SHIFT;
  
-		if (flags & AMDGPU_GEM_CREATE_NO_CPU_ACCESS &&
-   !(flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED) &&
-   adev->mc.visible_vram_size < adev->mc.real_vram_size) {
-   places[c].fpfn = visible_pfn;
-   places[c].lpfn = lpfn;
-   places[c].flags = TTM_PL_FLAG_WC |
-   TTM_PL_FLAG_UNCACHED | TTM_PL_FLAG_VRAM |
-   TTM_PL_FLAG_TOPDOWN;
-   c++;
-   }
-
places[c].fpfn = 0;
places[c].lpfn = lpfn;
places[c].flags = TTM_PL_FLAG_WC | TTM_PL_FLAG_UNCACHED |


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 2/2] drm/amdgpu: remove extra placement for AMDGPU_GEM_CREATE_NO_CPU_ACCESS

2016-11-06 Thread Michel Dänzer
On 04/11/16 08:00 PM, Christian König wrote:
> From: Christian König 
> 
> This only has the effect of scanning the invisible range twice
> since the topdown flag is given anyway.
> 
> Signed-off-by: Christian König 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 11 ---
>  1 file changed, 11 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index 6efa8d7..052c1b0 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -128,17 +128,6 @@ static void amdgpu_ttm_placement_init(struct amdgpu_device *adev,
>   if (flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS)
>   lpfn = adev->mc.real_vram_size >> PAGE_SHIFT;
>  
> - if (flags & AMDGPU_GEM_CREATE_NO_CPU_ACCESS &&
> - !(flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED) &&
> - adev->mc.visible_vram_size < adev->mc.real_vram_size) {
> - places[c].fpfn = visible_pfn;
> - places[c].lpfn = lpfn;
> - places[c].flags = TTM_PL_FLAG_WC |
> - TTM_PL_FLAG_UNCACHED | TTM_PL_FLAG_VRAM |
> - TTM_PL_FLAG_TOPDOWN;
> - c++;
> - }
> -
>   places[c].fpfn = 0;
>   places[c].lpfn = lpfn;
>   places[c].flags = TTM_PL_FLAG_WC | TTM_PL_FLAG_UNCACHED |
> 

Reviewed-by: Michel Dänzer 


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH] drm/amdgpu: fix logic error for checking amdgpu_vram_page_split

2016-11-06 Thread jimqu
Change-Id: I8b89da005b8c312a13d5df0c7f82a7921410797e
Signed-off-by: JimQu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index d7136a7..5970c40 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1031,8 +1031,8 @@ static void amdgpu_check_arguments(struct amdgpu_device *adev)
amdgpu_vm_block_size = 9;
}
 
-   if ((amdgpu_vram_page_split != -1 && amdgpu_vram_page_split < 16) ||
-   !amdgpu_check_pot_argument(amdgpu_vram_page_split)) {
+   if (amdgpu_vram_page_split != -1 && (amdgpu_vram_page_split < 16 ||
+   !amdgpu_check_pot_argument(amdgpu_vram_page_split))) {
dev_warn(adev->dev, "invalid VRAM page split (%d)\n",
 amdgpu_vram_page_split);
amdgpu_vram_page_split = 1024;
-- 
1.9.1
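
The inverted logic is easiest to see with the auto default of -1: since -1 is
not a power of two, the old condition warned about it through the
!amdgpu_check_pot_argument() branch even though -1 is meant to be valid. A
small userspace sketch of the two predicates (check_pot mirrors what
amdgpu_check_pot_argument() is assumed to do; treat that as an assumption):

#include <stdbool.h>
#include <stdio.h>

/* Assumed equivalent of amdgpu_check_pot_argument(). */
static bool check_pot(int v)
{
	return (v & (v - 1)) == 0;
}

int main(void)
{
	const int vals[] = { -1, 8, 100, 512 };

	for (int i = 0; i < 4; i++) {
		int v = vals[i];
		bool old_invalid = (v != -1 && v < 16) || !check_pot(v);
		bool new_invalid = v != -1 && (v < 16 || !check_pot(v));

		/* Only v == -1 differs: old=1 (wrongly rejected), new=0. */
		printf("%4d: old=%d new=%d\n", v, old_invalid, new_invalid);
	}
	return 0;
}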

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 1/2] drm/amdgpu: disable the VRAM manager on special placements v2

2016-11-06 Thread Qu, Jim
Hi Christian,

if (bo->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS ||
-   amdgpu_vram_page_split == -1) {
+   place->lpfn || amdgpu_vram_page_split == -1) {
pages_per_node = ~0ul;
num_nodes = 1;
} else {

Here, checking place->lpfn: does that mean that if the required placement is
located in visible VRAM, the driver runs the code path without page splitting?
If so, it seems to come back to the starting point: in extreme cases,
allocating visible VRAM fails.

Thanks
JimQu


From: amd-gfx  on behalf of Christian König 

Sent: November 4, 2016 19:00
To: amd-gfx@lists.freedesktop.org
Subject: [PATCH 1/2] drm/amdgpu: disable the VRAM manager on special placements v2

From: Christian König 

This disables the VRAM manager when a special placement is requested, otherwise
we play ping/pong with the buffers on every command submission.

v2: only check lpfn

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
index 180eed7c..d710226 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
@@ -108,7 +108,7 @@ static int amdgpu_vram_mgr_new(struct ttm_mem_type_manager *man,
lpfn = man->size;

if (bo->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS ||
-   amdgpu_vram_page_split == -1) {
+   place->lpfn || amdgpu_vram_page_split == -1) {
pages_per_node = ~0ul;
num_nodes = 1;
} else {
--
2.5.0

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] Fix RadeonCopyData bpp=2 case for big-endian

2016-11-06 Thread Michel Dänzer
On 04/11/16 05:21 PM, Jochen Rollwagen wrote:
> From 66b8b1513464aa3258ae6a024fcaea7a02e2def0 Mon Sep 17 00:00:00 2001
> From: Jochen Rollwagen 
> Date: Fri, 4 Nov 2016 09:11:38 +0100
> Subject: [PATCH] Fix RadeonCopyData bpp=2 case for big-endian
> 
The current code in RADEONCopyData blocks the bpp=2 case from setting
the byte swap to RADEON_HOST_DATA_SWAP_16BIT.
This patch fixes this.
> ---
>  src/radeon_video.c |   10 ++
>  1 file changed, 2 insertions(+), 8 deletions(-)
> 
> diff --git a/src/radeon_video.c b/src/radeon_video.c
> index d058986..2de0b48 100644
> --- a/src/radeon_video.c
> +++ b/src/radeon_video.c
> @@ -198,15 +198,10 @@ RADEONCopyData(
>unsigned int w,
>unsigned int bpp
>  ){
> -/* Get the byte-swapping right for big endian systems */
> -if ( bpp == 2 ) {
> -w *= 2;
> -bpp = 1;
> -}

Unless this actually fixes anything for you, NAK, see
https://cgit.freedesktop.org/xorg/driver/xf86-video-ati/commit/?id=5249f450a2487475a95531603cc8668db2c21c33
.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] Remove RADEON_HOST_DATA_SWAP_HDW case from RadeonCopySwap

2016-11-06 Thread Michel Dänzer
On 04/11/16 05:23 PM, Jochen Rollwagen wrote:
> From ba45efaafc3cf790c44b905d2f6272ef7830b403 Mon Sep 17 00:00:00 2001
> From: Jochen Rollwagen 
> Date: Fri, 4 Nov 2016 08:39:30 +0100
> Subject: [PATCH] Remove RADEON_HOST_DATA_SWAP_HDW case from RadeonCopySwap
> 
> RadeonCopySwap is never called with swap=RADEON_HOST_DATA_SWAP_HDW.
> Remove the case from the switch for clarity.
> ---
>  src/radeon_accel.c |   10 --
>  1 file changed, 10 deletions(-)
> 
> diff --git a/src/radeon_accel.c b/src/radeon_accel.c
> index 1def2a3..af2fc99 100644
> --- a/src/radeon_accel.c
> +++ b/src/radeon_accel.c
> @@ -131,16 +131,6 @@ int radeon_cs_space_remaining(ScrnInfoPtr pScrn)
>  void RADEONCopySwap(uint8_t *dst, uint8_t *src, unsigned int size, int swap)
>  {
>  switch(swap) {
> -case RADEON_HOST_DATA_SWAP_HDW:
> -{
> -unsigned int *d = (unsigned int *)dst;
> -unsigned int *s = (unsigned int *)src;
> -unsigned int nwords = size >> 2;
> -
> -for (; nwords > 0; --nwords, ++d, ++s)
> -*d = ((*s & 0xffff) << 16) | ((*s >> 16) & 0xffff);
> -return;
> -}
>  case RADEON_HOST_DATA_SWAP_32BIT:
>  {
>  unsigned int *d = (unsigned int *)dst;

It'd be better to keep something like

case RADEON_HOST_DATA_SWAP_HDW:
FatalError("Unsupported swap value RADEON_HOST_DATA_SWAP_HDW\n");

so that if somebody makes a change which causes this value to be passed
in, there's an obvious failure (Xorg aborts with the message passed to
FatalError) instead of a subtle one (RADEONCopySwap uses different byte
swapping than intended).


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: set bypass mode when uvd is idle.

2016-11-06 Thread Andy Furniss

Zhu, Rex wrote:

Is there any harm in just always putting it into bypass mode or
does it interact badly with PG?  Presumably it does (otherwise
we wouldn't need this patch), it would be good to note why.


Rex: when UVD PG is enabled, DCLK/VCLK will be turned off when UVD is
idle (DCLK = OFF). If we set bypass mode = 1, DCLK/VCLK will be bypassed
to an external 'bypass' clock (DCLK = 100 MHz).

So it is unnecessary to set bypass mode when PG is enabled.

+uvd_v5_0_set_bypass_mode(adev, !enable); this change is because of
Tom's commit 72cb64c1f6a3a8129af341e90418a687c4971a40, "Fix the
sequence of UVD powergate function", in smu7_clockgating.c.


I was about to file a bug until I tried this, which fixes UVD performance
on my R9 285 + agd5f drm-next-4.10-wip.

An additional, unrelated question: I notice that UVD does not seem
to set other clocks quite high enough when used.

For playback the vo may bump things up a bit, but even then it can be a bit
borderline for playing high bitrate UHD with powerplay on auto.

Pure decode benchmarks like

ffmpeg -hwaccel vdpau -i high-bitrate-2160p60-vid -pix_fmt nv12 -f null -

go from 63 -> 81 fps, powerplay auto -> high.
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx