Re: [PATCH] drm/amdgpu: fix NULL pointer dereference when run App with DRI_PRIME=1

2018-05-28 Thread Zhang, Jerry (Junwei)

On 05/25/2018 07:23 PM, Christian König wrote:

On 25.05.2018 at 11:51, Zhang, Jerry (Junwei) wrote:

On 05/25/2018 05:35 PM, Christian König wrote:

On 25.05.2018 at 10:23, Zhang, Jerry (Junwei) wrote:

On 05/25/2018 03:54 PM, Christian König wrote:

On 25.05.2018 at 09:20, Zhang, Jerry (Junwei) wrote:

On 05/25/2018 02:44 PM, Christian König wrote:

NAK, that probably just fixed the symptom but not the underlying problem.

Somebody is accessing the page array when it should never be accessed.


If PRIME imported the buffer as a GTT BO by default (right now it's a CPU BO),
it would happen immediately at GTT sg BO creation rather than at the next CS
validation.

Since ttm_sg_tt_init() only allocates gtt->ttm.dma_address when an sg BO is
created, the access to ttm->pages fails during TTM populate.


And that's exactly the problem: an imported BO should never be populated.



The current error happens in TTM populate, called from CS validation; the sg BO
is imported from an exporter.
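
The exchange above about ttm_sg_tt_init() and ttm->pages can be pictured with a
small userspace model. This is only a sketch under stated assumptions, not the
kernel code: struct model_tt and every model_* function are hypothetical names
invented for illustration. It assumes an sg-style init that allocates only the
DMA address array and a dma-style init that also allocates the page array, and
it shows a populate step that would have nothing to write into in the first
case. The model returns an error where the reported oops dereferences NULL.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace model; these are not the kernel's structures. */
struct model_tt {
	size_t num_pages;
	void **pages;               /* per-page pointers, NULL if never allocated */
	unsigned long *dma_address; /* per-page bus addresses */
};

/* Models an sg-style init: only the DMA address array is allocated. */
static int model_sg_init(struct model_tt *tt, size_t n)
{
	tt->num_pages = n;
	tt->pages = NULL;
	tt->dma_address = calloc(n, sizeof(*tt->dma_address));
	return tt->dma_address ? 0 : -1;
}

/* Models a dma-style init: the page array is allocated as well. */
static int model_dma_init(struct model_tt *tt, size_t n)
{
	if (model_sg_init(tt, n))
		return -1;
	tt->pages = calloc(n, sizeof(*tt->pages));
	if (!tt->pages) {
		free(tt->dma_address);
		tt->dma_address = NULL;
		return -1;
	}
	return 0;
}

/*
 * Models a populate step that writes one entry per page into both arrays.
 * Without the NULL check, an sg-initialized object would be dereferenced
 * through a NULL page array here. The model returns -1 instead of crashing.
 */
static int model_populate(struct model_tt *tt)
{
	size_t i;

	assert(tt != NULL);
	if (!tt->pages || !tt->dma_address)
		return -1;
	for (i = 0; i < tt->num_pages; i++) {
		tt->pages[i] = tt;                       /* placeholder pointer */
		tt->dma_address[i] = 0x1000UL * (i + 1); /* placeholder address */
	}
	return 0;
}

static void model_fini(struct model_tt *tt)
{
	free(tt->pages);
	free(tt->dma_address);
	tt->pages = NULL;
	tt->dma_address = NULL;
}

/* Returns 0 when populate is refused for sg init and succeeds for dma init. */
int model_demo(void)
{
	struct model_tt sg = {0}, dma = {0};
	int sg_ret = 0, dma_ret = -1;

	if (model_sg_init(&sg, 4) == 0)
		sg_ret = model_populate(&sg);   /* no page array to fill */
	if (model_dma_init(&dma, 4) == 0)
		dma_ret = model_populate(&dma); /* both arrays exist */
	model_fini(&sg);
	model_fini(&dma);
	return (sg_ret == -1 && dma_ret == 0) ? 0 : -1;
}
```

In this model, switching the init call makes populate succeed, which matches
what the proposed patch does, while the objection in the thread is about why
populate is reached for an imported buffer in the first place.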



How did you manage to trigger this?


DRI_PRIME=1 with Unigine Heaven.


Going to give that a try, but the last time I checked, that worked as expected.


FYI.
DRI_PRIME=1 glxinfo will not trigger that, but the game does.


Just tested and it works perfectly fine.

Is that on the closed stack or the open stack?


I used the unified driver (latest 18.20 build) + drm-next kernel, installed as
an all-open stack on an A+A platform.
(The issue was found with the 18.20 build, all-open stack (DKMS driver).)

BTW, how did you get the UMD? Via apt-get, or did you build it yourself?


That's a self-built Mesa + libdrm.

Do you have the apt URL and/or the package versions you used for the test at hand?


I found that the Ubuntu 4.13/4.15 kernels do not have the patch below:
  * https://cgit.freedesktop.org/~agd5f/linux/commit/?h=amd-staging-drm-next&id=186ca446aea19e49d2e1433dd170c6e1c211a52a


So we could fix that in the DKMS support rather than upstream.

I double-checked that the drm-next kernel has no such issue.
(Not sure what went wrong last week; I did get the latest code and build the
kernel, and it failed. Sorry for the inconvenience.)


Thanks for taking the time to check it.

Jerry



Christian.




Jerry



___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: fix NULL pointer dereference when run App with DRI_PRIME=1

2018-05-25 Thread Christian König

On 25.05.2018 at 11:51, Zhang, Jerry (Junwei) wrote:

On 05/25/2018 05:35 PM, Christian König wrote:

On 25.05.2018 at 10:23, Zhang, Jerry (Junwei) wrote:

On 05/25/2018 03:54 PM, Christian König wrote:

On 25.05.2018 at 09:20, Zhang, Jerry (Junwei) wrote:

On 05/25/2018 02:44 PM, Christian König wrote:
NAK, that probably just fixed the symptom but not the underlying problem.

Somebody is accessing the page array when it should never be accessed.


If PRIME imported the buffer as a GTT BO by default (right now it's a CPU BO),
it would happen immediately at GTT sg BO creation rather than at the next CS
validation.

Since ttm_sg_tt_init() only allocates gtt->ttm.dma_address when an sg BO is
created, the access to ttm->pages fails during TTM populate.


And that's exactly the problem: an imported BO should never be populated.



The current error happens in TTM populate, called from CS validation; the sg BO
is imported from an exporter.



How did you manage to trigger this?


DRI_PRIME=1 with Unigine Heaven.


Going to give that a try, but the last time I checked, that worked as expected.


FYI.
DRI_PRIME=1 glxinfo will not trigger that, but the game does.


Just tested and it works perfectly fine.

Is that on the closed stack or the open stack?


I used the unified driver (latest 18.20 build) + drm-next kernel, installed as
an all-open stack on an A+A platform.
(The issue was found with the 18.20 build, all-open stack (DKMS driver).)

BTW, how did you get the UMD? Via apt-get, or did you build it yourself?


That's a self-built Mesa + libdrm.

Do you have the apt URL and/or the package versions you used for the test at hand?


Christian.




Jerry




Re: [PATCH] drm/amdgpu: fix NULL pointer dereference when run App with DRI_PRIME=1

2018-05-25 Thread Zhang, Jerry (Junwei)

On 05/25/2018 05:35 PM, Christian König wrote:

On 25.05.2018 at 10:23, Zhang, Jerry (Junwei) wrote:

On 05/25/2018 03:54 PM, Christian König wrote:

On 25.05.2018 at 09:20, Zhang, Jerry (Junwei) wrote:

On 05/25/2018 02:44 PM, Christian König wrote:

NAK, that probably just fixed the symptom but not the underlying problem.

Somebody is accessing the page array when it should never be accessed.


If PRIME imported the buffer as a GTT BO by default (right now it's a CPU BO),
it would happen immediately at GTT sg BO creation rather than at the next CS
validation.

Since ttm_sg_tt_init() only allocates gtt->ttm.dma_address when an sg BO is
created, the access to ttm->pages fails during TTM populate.


And that's exactly the problem: an imported BO should never be populated.



The current error happens in TTM populate, called from CS validation; the sg BO
is imported from an exporter.



How did you manage to trigger this?


DRI_PRIME=1 with Unigine Heaven.


Going to give that a try, but the last time I checked, that worked as expected.


FYI.
DRI_PRIME=1 glxinfo will not trigger that, but the game does.


Just tested and it works perfectly fine.

Is that on the closed stack or the open stack?


I used the unified driver (latest 18.20 build) + drm-next kernel, installed as
an all-open stack on an A+A platform.
(The issue was found with the 18.20 build, all-open stack (DKMS driver).)

BTW, how did you get the UMD? Via apt-get, or did you build it yourself?


Jerry



Christian.



Jerry



Thanks,
Christian.



Regards,
Jerry



Regards,
Christian.

On 25.05.2018 at 07:41, Junwei Zhang wrote:

[  632.679861] BUG: unable to handle kernel NULL pointer dereference at (null)
[  632.679892] IP: drm_prime_sg_to_page_addr_arrays+0x52/0xb0 [drm]

[  632.680011] Call Trace:
[  632.680082]  amdgpu_ttm_tt_populate+0x3e/0xa0 [amdgpu]
[  632.680092]  ttm_tt_populate.part.7+0x22/0x60 [amdttm]
[  632.680098]  amdttm_tt_bind+0x52/0x60 [amdttm]
[  632.680106]  ttm_bo_handle_move_mem+0x54b/0x5c0 [amdttm]
[  632.680112]  ? find_next_bit+0xb/0x10
[  632.680119]  amdttm_bo_validate+0x11d/0x130 [amdttm]
[  632.680176]  amdgpu_cs_bo_validate+0x9d/0x150 [amdgpu]
[  632.680232]  amdgpu_cs_validate+0x41/0x270 [amdgpu]
[  632.680288]  amdgpu_cs_list_validate+0xc7/0x1a0 [amdgpu]
[  632.680343]  amdgpu_cs_ioctl+0x1634/0x1c00 [amdgpu]
[  632.680401]  ? amdgpu_cs_find_mapping+0x120/0x120 [amdgpu]
[  632.680416]  drm_ioctl_kernel+0x6b/0xb0 [drm]
[  632.680431]  drm_ioctl+0x3e4/0x450 [drm]
[  632.680485]  ? amdgpu_cs_find_mapping+0x120/0x120 [amdgpu]
[  632.680537]  amdgpu_drm_ioctl+0x4c/0x80 [amdgpu]
[  632.680542]  do_vfs_ioctl+0xa4/0x600
[  632.680546]  ? SyS_futex+0x7f/0x180
[  632.680549]  SyS_ioctl+0x79/0x90
[  632.680554]  entry_SYSCALL_64_fastpath+0x24/0xab

Signed-off-by: Junwei Zhang 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 57d4da6..b293809 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1212,7 +1212,7 @@ static struct ttm_tt *amdgpu_ttm_tt_create(struct ttm_buffer_object *bo,
 	gtt->ttm.ttm.func = &amdgpu_backend_func;
 
 	/* allocate space for the uninitialized page entries */
-	if (ttm_sg_tt_init(&gtt->ttm, bo, page_flags)) {
+	if (ttm_dma_tt_init(&gtt->ttm, bo, page_flags)) {
 		kfree(gtt);
 		return NULL;
 	}









Re: [PATCH] drm/amdgpu: fix NULL pointer dereference when run App with DRI_PRIME=1

2018-05-25 Thread Christian König

On 25.05.2018 at 10:23, Zhang, Jerry (Junwei) wrote:

On 05/25/2018 03:54 PM, Christian König wrote:

On 25.05.2018 at 09:20, Zhang, Jerry (Junwei) wrote:

On 05/25/2018 02:44 PM, Christian König wrote:
NAK, that probably just fixed the symptom but not the underlying problem.

Somebody is accessing the page array when it should never be accessed.


If PRIME imported the buffer as a GTT BO by default (right now it's a CPU BO),
it would happen immediately at GTT sg BO creation rather than at the next CS
validation.

Since ttm_sg_tt_init() only allocates gtt->ttm.dma_address when an sg BO is
created, the access to ttm->pages fails during TTM populate.


And that's exactly the problem: an imported BO should never be populated.



The current error happens in TTM populate, called from CS validation; the sg BO
is imported from an exporter.



How did you manage to trigger this?


DRI_PRIME=1 with Unigine Heaven.


Going to give that a try, but the last time I checked, that worked as expected.


FYI.
DRI_PRIME=1 glxinfo will not trigger that, but the game does.


Just tested and it works perfectly fine.

Is that on the closed stack or the open stack?

Christian.



Jerry



Thanks,
Christian.



Regards,
Jerry



Regards,
Christian.

On 25.05.2018 at 07:41, Junwei Zhang wrote:

[  632.679861] BUG: unable to handle kernel NULL pointer dereference at (null)
[  632.679892] IP: drm_prime_sg_to_page_addr_arrays+0x52/0xb0 [drm]

[  632.680011] Call Trace:
[  632.680082]  amdgpu_ttm_tt_populate+0x3e/0xa0 [amdgpu]
[  632.680092]  ttm_tt_populate.part.7+0x22/0x60 [amdttm]
[  632.680098]  amdttm_tt_bind+0x52/0x60 [amdttm]
[  632.680106]  ttm_bo_handle_move_mem+0x54b/0x5c0 [amdttm]
[  632.680112]  ? find_next_bit+0xb/0x10
[  632.680119]  amdttm_bo_validate+0x11d/0x130 [amdttm]
[  632.680176]  amdgpu_cs_bo_validate+0x9d/0x150 [amdgpu]
[  632.680232]  amdgpu_cs_validate+0x41/0x270 [amdgpu]
[  632.680288]  amdgpu_cs_list_validate+0xc7/0x1a0 [amdgpu]
[  632.680343]  amdgpu_cs_ioctl+0x1634/0x1c00 [amdgpu]
[  632.680401]  ? amdgpu_cs_find_mapping+0x120/0x120 [amdgpu]
[  632.680416]  drm_ioctl_kernel+0x6b/0xb0 [drm]
[  632.680431]  drm_ioctl+0x3e4/0x450 [drm]
[  632.680485]  ? amdgpu_cs_find_mapping+0x120/0x120 [amdgpu]
[  632.680537]  amdgpu_drm_ioctl+0x4c/0x80 [amdgpu]
[  632.680542]  do_vfs_ioctl+0xa4/0x600
[  632.680546]  ? SyS_futex+0x7f/0x180
[  632.680549]  SyS_ioctl+0x79/0x90
[  632.680554]  entry_SYSCALL_64_fastpath+0x24/0xab

Signed-off-by: Junwei Zhang 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 57d4da6..b293809 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1212,7 +1212,7 @@ static struct ttm_tt *amdgpu_ttm_tt_create(struct ttm_buffer_object *bo,
 	gtt->ttm.ttm.func = &amdgpu_backend_func;
 
 	/* allocate space for the uninitialized page entries */
-	if (ttm_sg_tt_init(&gtt->ttm, bo, page_flags)) {
+	if (ttm_dma_tt_init(&gtt->ttm, bo, page_flags)) {
 		kfree(gtt);
 		return NULL;
 	}








Re: [PATCH] drm/amdgpu: fix NULL pointer dereference when run App with DRI_PRIME=1

2018-05-25 Thread Zhang, Jerry (Junwei)

On 05/25/2018 03:54 PM, Christian König wrote:

On 25.05.2018 at 09:20, Zhang, Jerry (Junwei) wrote:

On 05/25/2018 02:44 PM, Christian König wrote:

NAK, that probably just fixed the symptom but not the underlying problem.

Somebody is accessing the page array when it should never be accessed.


If PRIME imported the buffer as a GTT BO by default (right now it's a CPU BO),
it would happen immediately at GTT sg BO creation rather than at the next CS
validation.

Since ttm_sg_tt_init() only allocates gtt->ttm.dma_address when an sg BO is
created, the access to ttm->pages fails during TTM populate.


And that's exactly the problem: an imported BO should never be populated.



The current error happens in TTM populate, called from CS validation; the sg BO
is imported from an exporter.



How did you manage to trigger this?


DRI_PRIME=1 with Unigine Heaven.


Going to give that a try, but the last time I checked, that worked as expected.


FYI.
DRI_PRIME=1 glxinfo will not trigger that, but the game does.

Jerry



Thanks,
Christian.



Regards,
Jerry



Regards,
Christian.

On 25.05.2018 at 07:41, Junwei Zhang wrote:

[  632.679861] BUG: unable to handle kernel NULL pointer dereference at (null)
[  632.679892] IP: drm_prime_sg_to_page_addr_arrays+0x52/0xb0 [drm]

[  632.680011] Call Trace:
[  632.680082]  amdgpu_ttm_tt_populate+0x3e/0xa0 [amdgpu]
[  632.680092]  ttm_tt_populate.part.7+0x22/0x60 [amdttm]
[  632.680098]  amdttm_tt_bind+0x52/0x60 [amdttm]
[  632.680106]  ttm_bo_handle_move_mem+0x54b/0x5c0 [amdttm]
[  632.680112]  ? find_next_bit+0xb/0x10
[  632.680119]  amdttm_bo_validate+0x11d/0x130 [amdttm]
[  632.680176]  amdgpu_cs_bo_validate+0x9d/0x150 [amdgpu]
[  632.680232]  amdgpu_cs_validate+0x41/0x270 [amdgpu]
[  632.680288]  amdgpu_cs_list_validate+0xc7/0x1a0 [amdgpu]
[  632.680343]  amdgpu_cs_ioctl+0x1634/0x1c00 [amdgpu]
[  632.680401]  ? amdgpu_cs_find_mapping+0x120/0x120 [amdgpu]
[  632.680416]  drm_ioctl_kernel+0x6b/0xb0 [drm]
[  632.680431]  drm_ioctl+0x3e4/0x450 [drm]
[  632.680485]  ? amdgpu_cs_find_mapping+0x120/0x120 [amdgpu]
[  632.680537]  amdgpu_drm_ioctl+0x4c/0x80 [amdgpu]
[  632.680542]  do_vfs_ioctl+0xa4/0x600
[  632.680546]  ? SyS_futex+0x7f/0x180
[  632.680549]  SyS_ioctl+0x79/0x90
[  632.680554]  entry_SYSCALL_64_fastpath+0x24/0xab

Signed-off-by: Junwei Zhang 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 57d4da6..b293809 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1212,7 +1212,7 @@ static struct ttm_tt *amdgpu_ttm_tt_create(struct ttm_buffer_object *bo,
 	gtt->ttm.ttm.func = &amdgpu_backend_func;
 
 	/* allocate space for the uninitialized page entries */
-	if (ttm_sg_tt_init(&gtt->ttm, bo, page_flags)) {
+	if (ttm_dma_tt_init(&gtt->ttm, bo, page_flags)) {
 		kfree(gtt);
 		return NULL;
 	}







Re: [PATCH] drm/amdgpu: fix NULL pointer dereference when run App with DRI_PRIME=1

2018-05-25 Thread Christian König

On 25.05.2018 at 09:20, Zhang, Jerry (Junwei) wrote:

On 05/25/2018 02:44 PM, Christian König wrote:
NAK, that probably just fixed the symptom but not the underlying problem.

Somebody is accessing the page array when it should never be accessed.


If PRIME imported the buffer as a GTT BO by default (right now it's a CPU BO),
it would happen immediately at GTT sg BO creation rather than at the next CS
validation.

Since ttm_sg_tt_init() only allocates gtt->ttm.dma_address when an sg BO is
created, the access to ttm->pages fails during TTM populate.


And that's exactly the problem: an imported BO should never be populated.



The current error happens in TTM populate, called from CS validation; the sg BO
is imported from an exporter.




How did you manage to trigger this?


DRI_PRIME=1 with Unigine Heaven.


Going to give that a try, but the last time I checked, that worked as expected.

Thanks,
Christian.



Regards,
Jerry



Regards,
Christian.

On 25.05.2018 at 07:41, Junwei Zhang wrote:

[  632.679861] BUG: unable to handle kernel NULL pointer dereference at (null)
[  632.679892] IP: drm_prime_sg_to_page_addr_arrays+0x52/0xb0 [drm]

[  632.680011] Call Trace:
[  632.680082]  amdgpu_ttm_tt_populate+0x3e/0xa0 [amdgpu]
[  632.680092]  ttm_tt_populate.part.7+0x22/0x60 [amdttm]
[  632.680098]  amdttm_tt_bind+0x52/0x60 [amdttm]
[  632.680106]  ttm_bo_handle_move_mem+0x54b/0x5c0 [amdttm]
[  632.680112]  ? find_next_bit+0xb/0x10
[  632.680119]  amdttm_bo_validate+0x11d/0x130 [amdttm]
[  632.680176]  amdgpu_cs_bo_validate+0x9d/0x150 [amdgpu]
[  632.680232]  amdgpu_cs_validate+0x41/0x270 [amdgpu]
[  632.680288]  amdgpu_cs_list_validate+0xc7/0x1a0 [amdgpu]
[  632.680343]  amdgpu_cs_ioctl+0x1634/0x1c00 [amdgpu]
[  632.680401]  ? amdgpu_cs_find_mapping+0x120/0x120 [amdgpu]
[  632.680416]  drm_ioctl_kernel+0x6b/0xb0 [drm]
[  632.680431]  drm_ioctl+0x3e4/0x450 [drm]
[  632.680485]  ? amdgpu_cs_find_mapping+0x120/0x120 [amdgpu]
[  632.680537]  amdgpu_drm_ioctl+0x4c/0x80 [amdgpu]
[  632.680542]  do_vfs_ioctl+0xa4/0x600
[  632.680546]  ? SyS_futex+0x7f/0x180
[  632.680549]  SyS_ioctl+0x79/0x90
[  632.680554]  entry_SYSCALL_64_fastpath+0x24/0xab

Signed-off-by: Junwei Zhang 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 57d4da6..b293809 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1212,7 +1212,7 @@ static struct ttm_tt *amdgpu_ttm_tt_create(struct ttm_buffer_object *bo,
 	gtt->ttm.ttm.func = &amdgpu_backend_func;
 
 	/* allocate space for the uninitialized page entries */
-	if (ttm_sg_tt_init(&gtt->ttm, bo, page_flags)) {
+	if (ttm_dma_tt_init(&gtt->ttm, bo, page_flags)) {
 		kfree(gtt);
 		return NULL;
 	}






Re: [PATCH] drm/amdgpu: fix NULL pointer dereference when run App with DRI_PRIME=1

2018-05-25 Thread Zhang, Jerry (Junwei)

On 05/25/2018 02:44 PM, Christian König wrote:

NAK, that probably just fixed the symptom but not the underlying problem.

Somebody is accessing the page array when it should never be accessed.


If PRIME imported the buffer as a GTT BO by default (right now it's a CPU BO),
it would happen immediately at GTT sg BO creation rather than at the next CS
validation.


Since ttm_sg_tt_init() only allocates gtt->ttm.dma_address when an sg BO is
created, the access to ttm->pages fails during TTM populate.


The current error happens in TTM populate, called from CS validation; the sg BO
is imported from an exporter.




How did you manage to trigger this?


DRI_PRIME=1 with Unigine Heaven.

Regards,
Jerry



Regards,
Christian.

On 25.05.2018 at 07:41, Junwei Zhang wrote:

[  632.679861] BUG: unable to handle kernel NULL pointer dereference at (null)
[  632.679892] IP: drm_prime_sg_to_page_addr_arrays+0x52/0xb0 [drm]

[  632.680011] Call Trace:
[  632.680082]  amdgpu_ttm_tt_populate+0x3e/0xa0 [amdgpu]
[  632.680092]  ttm_tt_populate.part.7+0x22/0x60 [amdttm]
[  632.680098]  amdttm_tt_bind+0x52/0x60 [amdttm]
[  632.680106]  ttm_bo_handle_move_mem+0x54b/0x5c0 [amdttm]
[  632.680112]  ? find_next_bit+0xb/0x10
[  632.680119]  amdttm_bo_validate+0x11d/0x130 [amdttm]
[  632.680176]  amdgpu_cs_bo_validate+0x9d/0x150 [amdgpu]
[  632.680232]  amdgpu_cs_validate+0x41/0x270 [amdgpu]
[  632.680288]  amdgpu_cs_list_validate+0xc7/0x1a0 [amdgpu]
[  632.680343]  amdgpu_cs_ioctl+0x1634/0x1c00 [amdgpu]
[  632.680401]  ? amdgpu_cs_find_mapping+0x120/0x120 [amdgpu]
[  632.680416]  drm_ioctl_kernel+0x6b/0xb0 [drm]
[  632.680431]  drm_ioctl+0x3e4/0x450 [drm]
[  632.680485]  ? amdgpu_cs_find_mapping+0x120/0x120 [amdgpu]
[  632.680537]  amdgpu_drm_ioctl+0x4c/0x80 [amdgpu]
[  632.680542]  do_vfs_ioctl+0xa4/0x600
[  632.680546]  ? SyS_futex+0x7f/0x180
[  632.680549]  SyS_ioctl+0x79/0x90
[  632.680554]  entry_SYSCALL_64_fastpath+0x24/0xab

Signed-off-by: Junwei Zhang 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 57d4da6..b293809 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1212,7 +1212,7 @@ static struct ttm_tt *amdgpu_ttm_tt_create(struct ttm_buffer_object *bo,
 	gtt->ttm.ttm.func = &amdgpu_backend_func;
 
 	/* allocate space for the uninitialized page entries */
-	if (ttm_sg_tt_init(&gtt->ttm, bo, page_flags)) {
+	if (ttm_dma_tt_init(&gtt->ttm, bo, page_flags)) {
 		kfree(gtt);
 		return NULL;
 	}





Re: [PATCH] drm/amdgpu: fix NULL pointer dereference when run App with DRI_PRIME=1

2018-05-24 Thread Christian König

NAK, that probably just fixed the symptom but not the underlying problem.

Somebody is accessing the page array when it should never be accessed.

How did you manage to trigger this?

Regards,
Christian.

On 25.05.2018 at 07:41, Junwei Zhang wrote:

[  632.679861] BUG: unable to handle kernel NULL pointer dereference at (null)
[  632.679892] IP: drm_prime_sg_to_page_addr_arrays+0x52/0xb0 [drm]

[  632.680011] Call Trace:
[  632.680082]  amdgpu_ttm_tt_populate+0x3e/0xa0 [amdgpu]
[  632.680092]  ttm_tt_populate.part.7+0x22/0x60 [amdttm]
[  632.680098]  amdttm_tt_bind+0x52/0x60 [amdttm]
[  632.680106]  ttm_bo_handle_move_mem+0x54b/0x5c0 [amdttm]
[  632.680112]  ? find_next_bit+0xb/0x10
[  632.680119]  amdttm_bo_validate+0x11d/0x130 [amdttm]
[  632.680176]  amdgpu_cs_bo_validate+0x9d/0x150 [amdgpu]
[  632.680232]  amdgpu_cs_validate+0x41/0x270 [amdgpu]
[  632.680288]  amdgpu_cs_list_validate+0xc7/0x1a0 [amdgpu]
[  632.680343]  amdgpu_cs_ioctl+0x1634/0x1c00 [amdgpu]
[  632.680401]  ? amdgpu_cs_find_mapping+0x120/0x120 [amdgpu]
[  632.680416]  drm_ioctl_kernel+0x6b/0xb0 [drm]
[  632.680431]  drm_ioctl+0x3e4/0x450 [drm]
[  632.680485]  ? amdgpu_cs_find_mapping+0x120/0x120 [amdgpu]
[  632.680537]  amdgpu_drm_ioctl+0x4c/0x80 [amdgpu]
[  632.680542]  do_vfs_ioctl+0xa4/0x600
[  632.680546]  ? SyS_futex+0x7f/0x180
[  632.680549]  SyS_ioctl+0x79/0x90
[  632.680554]  entry_SYSCALL_64_fastpath+0x24/0xab

Signed-off-by: Junwei Zhang 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 57d4da6..b293809 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1212,7 +1212,7 @@ static struct ttm_tt *amdgpu_ttm_tt_create(struct ttm_buffer_object *bo,
 	gtt->ttm.ttm.func = &amdgpu_backend_func;
 
 	/* allocate space for the uninitialized page entries */
-	if (ttm_sg_tt_init(&gtt->ttm, bo, page_flags)) {
+	if (ttm_dma_tt_init(&gtt->ttm, bo, page_flags)) {
 		kfree(gtt);
 		return NULL;
 	}

