Re: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer dereference
Did find way to reproduce issue constantly. After applying David's patch "0001-drm-amdgpu-fix-signaled-fence-isn-t-handled" with minor change -static struct dma_fence *drm_syncobj_get_stub_fence(void) +struct dma_fence *drm_syncobj_get_stub_fence(void) was able to avoid kernel panic due to NULL pointer dereference. So seems it is resolving the problem which I was trying to address. Can you please put revision of this patch? Though only one time I got reboot with dmesg.. [ 213.714324] RIP: 0010:reservation_object_add_shared_fence+0x22e/0x245 [ 213.714331] RSP: 0018:a03900b5ba80 EFLAGS: 00010246 [ 213.714338] RAX: 0004 RBX: a03900b5bba8 RCX: 969421139d00 [ 213.714343] RDX: RSI: 9693639662e0 RDI: 9694203db230 [ 213.714349] RBP: a03900b5bad0 R08: 002a R09: 969363966280 [ 213.714354] R10: 000180800079 R11: a3637f66 R12: 96941ea5cf00 [ 213.714360] R13: R14: 9693639662e0 R15: 9694203db230 [ 213.714366] FS: 786009e5f700() GS:96942ec0() knlGS: [ 213.714372] CS: 0010 DS: ES: CR0: 80050033 [ 213.714377] CR2: 7860023a4000 CR3: 00011ea8a000 CR4: 001406f0 [ 213.714382] Call Trace: [ 213.714397] ttm_eu_fence_buffer_objects+0x53/0x89 [ 213.714407] amdgpu_cs_ioctl+0x194f/0x196c [ 213.714420] ? amdgpu_cs_report_moved_bytes+0x5f/0x5f [ 213.714427] drm_ioctl_kernel+0x98/0xac [ 213.714435] drm_ioctl+0x235/0x357 [ 213.714443] ? amdgpu_cs_report_moved_bytes+0x5f/0x5f [ 213.714450] ? __switch_to_asm+0x34/0x70 [ 213.714456] ? __switch_to_asm+0x40/0x70 [ 213.714462] ? __switch_to_asm+0x34/0x70 [ 213.714468] ? __switch_to_asm+0x40/0x70 [ 213.714476] amdgpu_drm_ioctl+0x46/0x78 [ 213.714500] vfs_ioctl+0x1b/0x30 [ 213.714507] do_vfs_ioctl+0x492/0x6c1 [ 213.714530] SyS_ioctl+0x52/0x77 [ 213.714537] do_syscall_64+0x6d/0x81 [ 213.714544] entry_SYSCALL_64_after_hwframe+0x3d/0xa2 [ 213.714551] RIP: 0033:0x78600c9c6967 [ 213.714556] RSP: 002b:786009e5ec08 EFLAGS: 0246 ORIG_RAX: 0010 [ 213.714563] RAX: ffda RBX: c0186444 RCX: 78600c9c6967 [ 213.714568] RDX: 786009e5ec60 RSI: c0186444 RDI: 000d [ 213.714573] RBP: 786009e5ec40 R08: 786009e5ed00 R09: 786009e5ec50 [ 213.714578] R10: R11: 0246 R12: 000d [ 213.714583] R13: 3ba2573ff128 R14: R15: 786009e5ec60 Its worth to mention that I am not using latest kernel .relevant git log 71a0556255de125b7e3fc0dc6171fb31fab2b9ad drm/syncobj: Don't leak fences when WAIT_FOR_SUBMIT is set 4034318d2afac8f7f43457a4b480b877a94ec651 drm/syncobj: Stop reusing the same struct file for all syncobj -> fd So have not taken any recent changes from https://patchwork.kernel.org/patch/10575275/ series. only backported attached two patches . I need to check if something is missing that can cause issue. Thanks, Deepak From: Sharma, Deepak Sent: Tuesday, November 27, 2018 3:12 PM To: Zhou, David(ChunMing); Koenig, Christian; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer dereference Thanks David , I did backport the drm patch to my kernel after that I am not getting -ENOMEM from amdgpu_cs_ioctl while running tests so have not been able to test patch to handle signaled fence. As this issue is hard to reproduce , will give some more try. But yes the problem is there and need to handle when fence is null that your patch seems to handle it correctly. -Deepak From: Zhou, David(ChunMing) Sent: Monday, November 26, 2018 6:40 PM To: Sharma, Deepak; Zhou, David(ChunMing); Koenig, Christian; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer dereference Yeah, you need another drm patch as well when you apply my patch. Attached. -David On 2018年11月27日 08:40, Sharma, Deepak wrote: > > On 11/26/18 1:57 AM, Zhou, David(ChunMing) wrote: >> >>> -Original Message- >>> From: Christian König >>> Sent: Monday, November 26, 2018 5:23 PM >>> To: Sharma, Deepak ; Zhou, David(ChunMing) >>> ; Koenig, Christian ; >>> amd-gfx@lists.freedesktop.org >>> Subject: Re: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer >>> dereference >>> >>> Am 26.11.18 um 02:59 schrieb Sharma, Deepak: >>>> 在 2018/11/24 2:10, Koenig, Christian 写道: >>>>> Am 23.11.18 um 15:10 schrieb Zhou, David(ChunMing): >>>>>> 在 2018/11/23 21:30, Koenig, Christian 写道: >>>>>>> Am 23.11.18 um 14:27 schrieb Zhou, David(ChunMing): >>>>>>>> 在 2018/11/22 19:25, Christian König 写道:
RE: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer dereference
-Original Message- From: amd-gfx On Behalf Of Christian König Sent: Tuesday, November 27, 2018 1:52 AM To: Sharma, Deepak ; Zhou, David(ChunMing) ; Koenig, Christian ; Sharma, Deepak ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer dereference Am 27.11.18 um 01:40 schrieb Deepak Sharma: > > On 11/26/18 1:57 AM, Zhou, David(ChunMing) wrote: >> >>> -Original Message- >>> From: Christian König >>> Sent: Monday, November 26, 2018 5:23 PM >>> To: Sharma, Deepak ; Zhou, David(ChunMing) >>> ; Koenig, Christian ; >>> amd-gfx@lists.freedesktop.org >>> Subject: Re: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer >>> dereference >>> >>> Am 26.11.18 um 02:59 schrieb Sharma, Deepak: >>>> 在 2018/11/24 2:10, Koenig, Christian 写道: >>>>> Am 23.11.18 um 15:10 schrieb Zhou, David(ChunMing): >>>>>> 在 2018/11/23 21:30, Koenig, Christian 写道: >>>>>>> Am 23.11.18 um 14:27 schrieb Zhou, David(ChunMing): >>>>>>>> 在 2018/11/22 19:25, Christian König 写道: >>>>>>>>> Am 22.11.18 um 07:56 schrieb Sharma, Deepak: >>>>>>>>>> when returned fence is not valid mostly due to userspace ignored >>>>>>>>>> previous error causes NULL pointer dereference. >>>>>>>>> Again, this is clearly incorrect. The my other mails on the >>>>>>>>> earlier patch. >>>>>>>> Sorry for I didn't get your history, but looks from the patch >>>>>>>> itself, it is still a valid patch, isn't it? >>>>>>> No, the semantic of amdgpu_ctx_get_fence() is that we return NULL >>>>>>> when the fence is already signaled. >>>>>>> >>>>>>> So this patch could totally break userspace because it changes the >>>>>>> behavior when we try to sync to an already signaled fence. >>>>>> Ah, I got your meaning, how about attached patch? >>>>> Yeah something like this, but I would just give the >>>>> DRM_SYNCOBJ_CREATE_SIGNALED instead. >>>>> >>>>> I mean that's what this flag is good for isn't it? >>>> Yeah, I give a flag initally when creating patch, but as you know, there >>>> is a >>> swich case not be able to use that flag: >>>> case AMDGPU_FENCE_TO_HANDLE_GET_SYNC_FILE_FD: >>>> fd = get_unused_fd_flags(O_CLOEXEC); >>>> if (fd < 0) { >>>> dma_fence_put(fence); >>>> return fd; >>>> } >>>> >>>> sync_file = sync_file_create(fence); >>>> dma_fence_put(fence); >>>> if (!sync_file) { >>>> put_unused_fd(fd); >>>> return -ENOMEM; >>>> } >>>> >>>> fd_install(fd, sync_file->file); >>>> info->out.handle = fd; >>>> return 0; >>>> >>>> So I change to stub fence instead. >>> Yeah, I've missed that case. Not sure if the sync_file can deal with a NULL >>> fence. >>> >>> We should then probably move the stub fence function into >>> dma_fence_stub.c under drivers/dma-buf to keep the stuff together. >> Yes, you wrap it to review first with your stub fence, we can do it >> separately first. >> >> -David >>>> -David >>>> >>>> I have not applied this patch. >>>> The issue was trying to address is when amdgpu_cs_ioctl() failed due to >>> low memory (ENOMEM) but userspace chose to proceed and called >>> amdgpu_cs_fence_to_handle_ioctl(). >>>> In amdgpu_cs_fence_to_handle_ioctl(), fence is null and later causing >>>> NULL pointer dereference, this patch was to avoid that and system panic >>> But I understand now that its a valid case retuning NULL if fence was >>> already >>> signaled but need to handle case so avoid kernel panic. Seems David patch >>> should fix this, I will test it tomorrow. >>> >>> Mhm, but don't we bail out with an error if we ask for a failed command >>> submission? If not that sounds like a bug as well. >>> >>> Christian. >>> > Where do we do that? > I see error > [drm:amdgpu_cs_ioctl] *ERROR* amdgpu_cs_list_validate(validated) fai
Re: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer dereference
Thanks David , I did backport the drm patch to my kernel after that I am not getting -ENOMEM from amdgpu_cs_ioctl while running tests so have not been able to test patch to handle signaled fence. As this issue is hard to reproduce , will give some more try. But yes the problem is there and need to handle when fence is null that your patch seems to handle it correctly. -Deepak From: Zhou, David(ChunMing) Sent: Monday, November 26, 2018 6:40 PM To: Sharma, Deepak; Zhou, David(ChunMing); Koenig, Christian; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer dereference Yeah, you need another drm patch as well when you apply my patch. Attached. -David On 2018年11月27日 08:40, Sharma, Deepak wrote: > > On 11/26/18 1:57 AM, Zhou, David(ChunMing) wrote: >> >>> -Original Message- >>> From: Christian König >>> Sent: Monday, November 26, 2018 5:23 PM >>> To: Sharma, Deepak ; Zhou, David(ChunMing) >>> ; Koenig, Christian ; >>> amd-gfx@lists.freedesktop.org >>> Subject: Re: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer >>> dereference >>> >>> Am 26.11.18 um 02:59 schrieb Sharma, Deepak: >>>> 在 2018/11/24 2:10, Koenig, Christian 写道: >>>>> Am 23.11.18 um 15:10 schrieb Zhou, David(ChunMing): >>>>>> 在 2018/11/23 21:30, Koenig, Christian 写道: >>>>>>> Am 23.11.18 um 14:27 schrieb Zhou, David(ChunMing): >>>>>>>> 在 2018/11/22 19:25, Christian König 写道: >>>>>>>>> Am 22.11.18 um 07:56 schrieb Sharma, Deepak: >>>>>>>>>> when returned fence is not valid mostly due to userspace ignored >>>>>>>>>> previous error causes NULL pointer dereference. >>>>>>>>> Again, this is clearly incorrect. The my other mails on the >>>>>>>>> earlier patch. >>>>>>>> Sorry for I didn't get your history, but looks from the patch >>>>>>>> itself, it is still a valid patch, isn't it? >>>>>>> No, the semantic of amdgpu_ctx_get_fence() is that we return NULL >>>>>>> when the fence is already signaled. >>>>>>> >>>>>>> So this patch could totally break userspace because it changes the >>>>>>> behavior when we try to sync to an already signaled fence. >>>>>> Ah, I got your meaning, how about attached patch? >>>>> Yeah something like this, but I would just give the >>>>> DRM_SYNCOBJ_CREATE_SIGNALED instead. >>>>> >>>>> I mean that's what this flag is good for isn't it? >>>> Yeah, I give a flag initally when creating patch, but as you know, there >>>> is a >>> swich case not be able to use that flag: >>>> case AMDGPU_FENCE_TO_HANDLE_GET_SYNC_FILE_FD: >>>> fd = get_unused_fd_flags(O_CLOEXEC); >>>> if (fd < 0) { >>>> dma_fence_put(fence); >>>> return fd; >>>> } >>>> >>>> sync_file = sync_file_create(fence); >>>> dma_fence_put(fence); >>>> if (!sync_file) { >>>> put_unused_fd(fd); >>>> return -ENOMEM; >>>> } >>>> >>>> fd_install(fd, sync_file->file); >>>> info->out.handle = fd; >>>> return 0; >>>> >>>> So I change to stub fence instead. >>> Yeah, I've missed that case. Not sure if the sync_file can deal with a NULL >>> fence. >>> >>> We should then probably move the stub fence function into >>> dma_fence_stub.c under drivers/dma-buf to keep the stuff together. >> Yes, you wrap it to review first with your stub fence, we can do it >> separately first. >> >> -David >>>> -David >>>> >>>> I have not applied this patch. >>>> The issue was trying to address is when amdgpu_cs_ioctl() failed due to >>> low memory (ENOMEM) but userspace chose to proceed and called >>> amdgpu_cs_fence_to_handle_ioctl(). >>>> In amdgpu_cs_fence_to_handle_ioctl(), fence is null and later causing >>>> NULL pointer dereference, this patch was to avoid that and system panic >>> But I understand now that its a valid case retuning NULL if fence was >>> already >>> signaled
Re: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer dereference
Am 27.11.18 um 01:40 schrieb Deepak Sharma: On 11/26/18 1:57 AM, Zhou, David(ChunMing) wrote: -Original Message- From: Christian König Sent: Monday, November 26, 2018 5:23 PM To: Sharma, Deepak ; Zhou, David(ChunMing) ; Koenig, Christian ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer dereference Am 26.11.18 um 02:59 schrieb Sharma, Deepak: 在 2018/11/24 2:10, Koenig, Christian 写道: Am 23.11.18 um 15:10 schrieb Zhou, David(ChunMing): 在 2018/11/23 21:30, Koenig, Christian 写道: Am 23.11.18 um 14:27 schrieb Zhou, David(ChunMing): 在 2018/11/22 19:25, Christian König 写道: Am 22.11.18 um 07:56 schrieb Sharma, Deepak: when returned fence is not valid mostly due to userspace ignored previous error causes NULL pointer dereference. Again, this is clearly incorrect. The my other mails on the earlier patch. Sorry for I didn't get your history, but looks from the patch itself, it is still a valid patch, isn't it? No, the semantic of amdgpu_ctx_get_fence() is that we return NULL when the fence is already signaled. So this patch could totally break userspace because it changes the behavior when we try to sync to an already signaled fence. Ah, I got your meaning, how about attached patch? Yeah something like this, but I would just give the DRM_SYNCOBJ_CREATE_SIGNALED instead. I mean that's what this flag is good for isn't it? Yeah, I give a flag initally when creating patch, but as you know, there is a swich case not be able to use that flag: case AMDGPU_FENCE_TO_HANDLE_GET_SYNC_FILE_FD: fd = get_unused_fd_flags(O_CLOEXEC); if (fd < 0) { dma_fence_put(fence); return fd; } sync_file = sync_file_create(fence); dma_fence_put(fence); if (!sync_file) { put_unused_fd(fd); return -ENOMEM; } fd_install(fd, sync_file->file); info->out.handle = fd; return 0; So I change to stub fence instead. Yeah, I've missed that case. Not sure if the sync_file can deal with a NULL fence. We should then probably move the stub fence function into dma_fence_stub.c under drivers/dma-buf to keep the stuff together. Yes, you wrap it to review first with your stub fence, we can do it separately first. -David -David I have not applied this patch. The issue was trying to address is when amdgpu_cs_ioctl() failed due to low memory (ENOMEM) but userspace chose to proceed and called amdgpu_cs_fence_to_handle_ioctl(). In amdgpu_cs_fence_to_handle_ioctl(), fence is null and later causing NULL pointer dereference, this patch was to avoid that and system panic But I understand now that its a valid case retuning NULL if fence was already signaled but need to handle case so avoid kernel panic. Seems David patch should fix this, I will test it tomorrow. Mhm, but don't we bail out with an error if we ask for a failed command submission? If not that sounds like a bug as well. Christian. Where do we do that? I see error [drm:amdgpu_cs_ioctl] *ERROR* amdgpu_cs_list_validate(validated) failed. [drm:amdgpu_cs_ioctl] *ERROR* Not enough memory for command submission! BUG: unable to handle kernel NULL pointer dereference at 0008 Did some more debugging,dma_fence_is_array() is causing NULL pointer dereference call through sync_file_ioctl. Also I think changes in David patch cant be applied on amd-staging-drm-next, which all patches I should take to get it correctly? Mhm, what you say here actually doesn't make much sense. When the sequence number is invalid because the submission failed amdgpu_ctx_get_fence() returns an error: if (seq >= centity->sequence) { spin_unlock(>ring_lock); return ERR_PTR(-EINVAL); } This error is then handled in amdgpu_cs_fence_to_handle_ioctl(): fence = amdgpu_cs_get_fence(adev, filp, >in.fence); if (IS_ERR(fence)) return PTR_ERR(fence); So the error handling for this is actually in place. The only thing that could go wrong is that userspace sends a sequence number which is to small. The correct solution is to either set the flag as I suggested and make sync_file_create() capable of handling a NULL fence. Or we make the stub fence global like David suggested. Regards, Christian. -Deepak Christian. -David If that patch was applied then please revert it immediately. Christian. -David If you have already pushed the patch then please revert. Christian. Signed-off-by: Deepak Sharma --- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c index 024dfbd87f11..14166cd8a12f 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c @@ -1403,6 +14
Re: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer dereference
Yeah, you need another drm patch as well when you apply my patch. Attached. -David On 2018年11月27日 08:40, Sharma, Deepak wrote: On 11/26/18 1:57 AM, Zhou, David(ChunMing) wrote: -Original Message- From: Christian König Sent: Monday, November 26, 2018 5:23 PM To: Sharma, Deepak ; Zhou, David(ChunMing) ; Koenig, Christian ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer dereference Am 26.11.18 um 02:59 schrieb Sharma, Deepak: 在 2018/11/24 2:10, Koenig, Christian 写道: Am 23.11.18 um 15:10 schrieb Zhou, David(ChunMing): 在 2018/11/23 21:30, Koenig, Christian 写道: Am 23.11.18 um 14:27 schrieb Zhou, David(ChunMing): 在 2018/11/22 19:25, Christian König 写道: Am 22.11.18 um 07:56 schrieb Sharma, Deepak: when returned fence is not valid mostly due to userspace ignored previous error causes NULL pointer dereference. Again, this is clearly incorrect. The my other mails on the earlier patch. Sorry for I didn't get your history, but looks from the patch itself, it is still a valid patch, isn't it? No, the semantic of amdgpu_ctx_get_fence() is that we return NULL when the fence is already signaled. So this patch could totally break userspace because it changes the behavior when we try to sync to an already signaled fence. Ah, I got your meaning, how about attached patch? Yeah something like this, but I would just give the DRM_SYNCOBJ_CREATE_SIGNALED instead. I mean that's what this flag is good for isn't it? Yeah, I give a flag initally when creating patch, but as you know, there is a swich case not be able to use that flag: case AMDGPU_FENCE_TO_HANDLE_GET_SYNC_FILE_FD: fd = get_unused_fd_flags(O_CLOEXEC); if (fd < 0) { dma_fence_put(fence); return fd; } sync_file = sync_file_create(fence); dma_fence_put(fence); if (!sync_file) { put_unused_fd(fd); return -ENOMEM; } fd_install(fd, sync_file->file); info->out.handle = fd; return 0; So I change to stub fence instead. Yeah, I've missed that case. Not sure if the sync_file can deal with a NULL fence. We should then probably move the stub fence function into dma_fence_stub.c under drivers/dma-buf to keep the stuff together. Yes, you wrap it to review first with your stub fence, we can do it separately first. -David -David I have not applied this patch. The issue was trying to address is when amdgpu_cs_ioctl() failed due to low memory (ENOMEM) but userspace chose to proceed and called amdgpu_cs_fence_to_handle_ioctl(). In amdgpu_cs_fence_to_handle_ioctl(), fence is null and later causing NULL pointer dereference, this patch was to avoid that and system panic But I understand now that its a valid case retuning NULL if fence was already signaled but need to handle case so avoid kernel panic. Seems David patch should fix this, I will test it tomorrow. Mhm, but don't we bail out with an error if we ask for a failed command submission? If not that sounds like a bug as well. Christian. Where do we do that? I see error [drm:amdgpu_cs_ioctl] *ERROR* amdgpu_cs_list_validate(validated) failed. [drm:amdgpu_cs_ioctl] *ERROR* Not enough memory for command submission! BUG: unable to handle kernel NULL pointer dereference at 0008 Did some more debugging,dma_fence_is_array() is causing NULL pointer dereference call through sync_file_ioctl. Also I think changes in David patch cant be applied on amd-staging-drm-next, which all patches I should take to get it correctly? -Deepak Christian. -David If that patch was applied then please revert it immediately. Christian. -David If you have already pushed the patch then please revert. Christian. Signed-off-by: Deepak Sharma --- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c index 024dfbd87f11..14166cd8a12f 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c @@ -1403,6 +1403,8 @@ static struct dma_fence *amdgpu_cs_get_fence(struct amdgpu_device *adev, fence = amdgpu_ctx_get_fence(ctx, entity, user->seq_no); amdgpu_ctx_put(ctx); + if(!fence) + return ERR_PTR(-EINVAL); return fence; } ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx Received: from SN1PR12MB0509.namprd12.prod.outlook.com (2603:10b6:a02:a8::31) by BY1PR12MB0502.namprd12.prod.outlook.com with HTTPS via BYAPR03CA0018.NAMPRD03.PROD.OUTLOOK.COM; Thu, 15 Nov 2018 11:12:57 + Received: from MWHPR12CA0057.namprd12.prod.outlook.com (2603:10b6:300:103::19) by SN1PR12MB0509.namprd12.prod.outlook.com (2a01:111:e
Re: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer dereference
On 11/26/18 1:57 AM, Zhou, David(ChunMing) wrote: > > >> -Original Message- >> From: Christian König >> Sent: Monday, November 26, 2018 5:23 PM >> To: Sharma, Deepak ; Zhou, David(ChunMing) >> ; Koenig, Christian ; >> amd-gfx@lists.freedesktop.org >> Subject: Re: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer >> dereference >> >> Am 26.11.18 um 02:59 schrieb Sharma, Deepak: >>> >>> 在 2018/11/24 2:10, Koenig, Christian 写道: >>>> Am 23.11.18 um 15:10 schrieb Zhou, David(ChunMing): >>>>> 在 2018/11/23 21:30, Koenig, Christian 写道: >>>>>> Am 23.11.18 um 14:27 schrieb Zhou, David(ChunMing): >>>>>>> 在 2018/11/22 19:25, Christian König 写道: >>>>>>>> Am 22.11.18 um 07:56 schrieb Sharma, Deepak: >>>>>>>>> when returned fence is not valid mostly due to userspace ignored >>>>>>>>> previous error causes NULL pointer dereference. >>>>>>>> Again, this is clearly incorrect. The my other mails on the >>>>>>>> earlier patch. >>>>>>> Sorry for I didn't get your history, but looks from the patch >>>>>>> itself, it is still a valid patch, isn't it? >>>>>> No, the semantic of amdgpu_ctx_get_fence() is that we return NULL >>>>>> when the fence is already signaled. >>>>>> >>>>>> So this patch could totally break userspace because it changes the >>>>>> behavior when we try to sync to an already signaled fence. >>>>> Ah, I got your meaning, how about attached patch? >>>> Yeah something like this, but I would just give the >>>> DRM_SYNCOBJ_CREATE_SIGNALED instead. >>>> >>>> I mean that's what this flag is good for isn't it? >>> Yeah, I give a flag initally when creating patch, but as you know, there is >>> a >> swich case not be able to use that flag: >>> >>> case AMDGPU_FENCE_TO_HANDLE_GET_SYNC_FILE_FD: >>> fd = get_unused_fd_flags(O_CLOEXEC); >>> if (fd < 0) { >>> dma_fence_put(fence); >>> return fd; >>> } >>> >>> sync_file = sync_file_create(fence); >>> dma_fence_put(fence); >>> if (!sync_file) { >>> put_unused_fd(fd); >>> return -ENOMEM; >>> } >>> >>> fd_install(fd, sync_file->file); >>> info->out.handle = fd; >>> return 0; >>> >>> So I change to stub fence instead. >> >> Yeah, I've missed that case. Not sure if the sync_file can deal with a NULL >> fence. >> >> We should then probably move the stub fence function into >> dma_fence_stub.c under drivers/dma-buf to keep the stuff together. > > Yes, you wrap it to review first with your stub fence, we can do it > separately first. > > -David >> >>> >>> -David >>> >>> I have not applied this patch. >>> The issue was trying to address is when amdgpu_cs_ioctl() failed due to >> low memory (ENOMEM) but userspace chose to proceed and called >> amdgpu_cs_fence_to_handle_ioctl(). >>> In amdgpu_cs_fence_to_handle_ioctl(), fence is null and later causing >>> NULL pointer dereference, this patch was to avoid that and system panic >> But I understand now that its a valid case retuning NULL if fence was already >> signaled but need to handle case so avoid kernel panic. Seems David patch >> should fix this, I will test it tomorrow. >> >> Mhm, but don't we bail out with an error if we ask for a failed command >> submission? If not that sounds like a bug as well. >> >> Christian. >> Where do we do that? I see error [drm:amdgpu_cs_ioctl] *ERROR* amdgpu_cs_list_validate(validated) failed. [drm:amdgpu_cs_ioctl] *ERROR* Not enough memory for command submission! BUG: unable to handle kernel NULL pointer dereference at 0008 Did some more debugging,dma_fence_is_array() is causing NULL pointer dereference call through sync_file_ioctl. Also I think changes in David patch cant be applied on amd-staging-drm-next, which all patches I should take to get it correctly? >>> >>> -Deepak >>>> Christian. >>>> >>>>> -David >>>>>> If that patch was applied then please revert it immediately. >>>>>> >>>>>> Christian. >>>>>&g
RE: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer dereference
> -Original Message- > From: Christian König > Sent: Monday, November 26, 2018 5:23 PM > To: Sharma, Deepak ; Zhou, David(ChunMing) > ; Koenig, Christian ; > amd-gfx@lists.freedesktop.org > Subject: Re: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer > dereference > > Am 26.11.18 um 02:59 schrieb Sharma, Deepak: > > > > 在 2018/11/24 2:10, Koenig, Christian 写道: > >> Am 23.11.18 um 15:10 schrieb Zhou, David(ChunMing): > >>> 在 2018/11/23 21:30, Koenig, Christian 写道: > >>>> Am 23.11.18 um 14:27 schrieb Zhou, David(ChunMing): > >>>>> 在 2018/11/22 19:25, Christian König 写道: > >>>>>> Am 22.11.18 um 07:56 schrieb Sharma, Deepak: > >>>>>>> when returned fence is not valid mostly due to userspace ignored > >>>>>>> previous error causes NULL pointer dereference. > >>>>>> Again, this is clearly incorrect. The my other mails on the > >>>>>> earlier patch. > >>>>> Sorry for I didn't get your history, but looks from the patch > >>>>> itself, it is still a valid patch, isn't it? > >>>> No, the semantic of amdgpu_ctx_get_fence() is that we return NULL > >>>> when the fence is already signaled. > >>>> > >>>> So this patch could totally break userspace because it changes the > >>>> behavior when we try to sync to an already signaled fence. > >>> Ah, I got your meaning, how about attached patch? > >> Yeah something like this, but I would just give the > >> DRM_SYNCOBJ_CREATE_SIGNALED instead. > >> > >> I mean that's what this flag is good for isn't it? > > Yeah, I give a flag initally when creating patch, but as you know, there is > > a > swich case not be able to use that flag: > > > > case AMDGPU_FENCE_TO_HANDLE_GET_SYNC_FILE_FD: > > fd = get_unused_fd_flags(O_CLOEXEC); > > if (fd < 0) { > > dma_fence_put(fence); > > return fd; > > } > > > > sync_file = sync_file_create(fence); > > dma_fence_put(fence); > > if (!sync_file) { > > put_unused_fd(fd); > > return -ENOMEM; > > } > > > > fd_install(fd, sync_file->file); > > info->out.handle = fd; > > return 0; > > > > So I change to stub fence instead. > > Yeah, I've missed that case. Not sure if the sync_file can deal with a NULL > fence. > > We should then probably move the stub fence function into > dma_fence_stub.c under drivers/dma-buf to keep the stuff together. Yes, you wrap it to review first with your stub fence, we can do it separately first. -David > > > > > -David > > > > I have not applied this patch. > > The issue was trying to address is when amdgpu_cs_ioctl() failed due to > low memory (ENOMEM) but userspace chose to proceed and called > amdgpu_cs_fence_to_handle_ioctl(). > > In amdgpu_cs_fence_to_handle_ioctl(), fence is null and later causing > > NULL pointer dereference, this patch was to avoid that and system panic > But I understand now that its a valid case retuning NULL if fence was already > signaled but need to handle case so avoid kernel panic. Seems David patch > should fix this, I will test it tomorrow. > > Mhm, but don't we bail out with an error if we ask for a failed command > submission? If not that sounds like a bug as well. > > Christian. > > > > > -Deepak > >> Christian. > >> > >>> -David > >>>> If that patch was applied then please revert it immediately. > >>>> > >>>> Christian. > >>>> > >>>>> -David > >>>>>> If you have already pushed the patch then please revert. > >>>>>> > >>>>>> Christian. > >>>>>> > >>>>>>> Signed-off-by: Deepak Sharma > >>>>>>> --- > >>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++ > >>>>>>> 1 file changed, 2 insertions(+) > >>>>>>> > >>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > >>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > >>>>>>> index 024dfbd87f11..14166cd8a12f 100644 > >>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > >>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > >>>>>>> @@ -1403,6 +1403,8 @@ static struct dma_fence > >>>>>>> *amdgpu_cs_get_fence(struct amdgpu_device *adev, > >>>>>>> fence = amdgpu_ctx_get_fence(ctx, entity, user->seq_no); > >>>>>>> amdgpu_ctx_put(ctx); > >>>>>>> + if(!fence) > >>>>>>> + return ERR_PTR(-EINVAL); > >>>>>>> return fence; > >>>>>>> } > > ___ > > amd-gfx mailing list > > amd-gfx@lists.freedesktop.org > > https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer dereference
Am 26.11.18 um 02:59 schrieb Sharma, Deepak: 在 2018/11/24 2:10, Koenig, Christian 写道: Am 23.11.18 um 15:10 schrieb Zhou, David(ChunMing): 在 2018/11/23 21:30, Koenig, Christian 写道: Am 23.11.18 um 14:27 schrieb Zhou, David(ChunMing): 在 2018/11/22 19:25, Christian König 写道: Am 22.11.18 um 07:56 schrieb Sharma, Deepak: when returned fence is not valid mostly due to userspace ignored previous error causes NULL pointer dereference. Again, this is clearly incorrect. The my other mails on the earlier patch. Sorry for I didn't get your history, but looks from the patch itself, it is still a valid patch, isn't it? No, the semantic of amdgpu_ctx_get_fence() is that we return NULL when the fence is already signaled. So this patch could totally break userspace because it changes the behavior when we try to sync to an already signaled fence. Ah, I got your meaning, how about attached patch? Yeah something like this, but I would just give the DRM_SYNCOBJ_CREATE_SIGNALED instead. I mean that's what this flag is good for isn't it? Yeah, I give a flag initally when creating patch, but as you know, there is a swich case not be able to use that flag: case AMDGPU_FENCE_TO_HANDLE_GET_SYNC_FILE_FD: fd = get_unused_fd_flags(O_CLOEXEC); if (fd < 0) { dma_fence_put(fence); return fd; } sync_file = sync_file_create(fence); dma_fence_put(fence); if (!sync_file) { put_unused_fd(fd); return -ENOMEM; } fd_install(fd, sync_file->file); info->out.handle = fd; return 0; So I change to stub fence instead. Yeah, I've missed that case. Not sure if the sync_file can deal with a NULL fence. We should then probably move the stub fence function into dma_fence_stub.c under drivers/dma-buf to keep the stuff together. -David I have not applied this patch. The issue was trying to address is when amdgpu_cs_ioctl() failed due to low memory (ENOMEM) but userspace chose to proceed and called amdgpu_cs_fence_to_handle_ioctl(). In amdgpu_cs_fence_to_handle_ioctl(), fence is null and later causing NULL pointer dereference, this patch was to avoid that and system panic But I understand now that its a valid case retuning NULL if fence was already signaled but need to handle case so avoid kernel panic. Seems David patch should fix this, I will test it tomorrow. Mhm, but don't we bail out with an error if we ask for a failed command submission? If not that sounds like a bug as well. Christian. -Deepak Christian. -David If that patch was applied then please revert it immediately. Christian. -David If you have already pushed the patch then please revert. Christian. Signed-off-by: Deepak Sharma --- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c index 024dfbd87f11..14166cd8a12f 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c @@ -1403,6 +1403,8 @@ static struct dma_fence *amdgpu_cs_get_fence(struct amdgpu_device *adev, fence = amdgpu_ctx_get_fence(ctx, entity, user->seq_no); amdgpu_ctx_put(ctx); + if(!fence) + return ERR_PTR(-EINVAL); return fence; } ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
RE: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer dereference
在 2018/11/24 2:10, Koenig, Christian 写道: > Am 23.11.18 um 15:10 schrieb Zhou, David(ChunMing): >> 在 2018/11/23 21:30, Koenig, Christian 写道: >>> Am 23.11.18 um 14:27 schrieb Zhou, David(ChunMing): 在 2018/11/22 19:25, Christian König 写道: > Am 22.11.18 um 07:56 schrieb Sharma, Deepak: >> when returned fence is not valid mostly due to userspace ignored >> previous error causes NULL pointer dereference. > Again, this is clearly incorrect. The my other mails on the > earlier patch. Sorry for I didn't get your history, but looks from the patch itself, it is still a valid patch, isn't it? >>> No, the semantic of amdgpu_ctx_get_fence() is that we return NULL >>> when the fence is already signaled. >>> >>> So this patch could totally break userspace because it changes the >>> behavior when we try to sync to an already signaled fence. >> Ah, I got your meaning, how about attached patch? > Yeah something like this, but I would just give the > DRM_SYNCOBJ_CREATE_SIGNALED instead. > > I mean that's what this flag is good for isn't it? Yeah, I give a flag initally when creating patch, but as you know, there is a swich case not be able to use that flag: case AMDGPU_FENCE_TO_HANDLE_GET_SYNC_FILE_FD: fd = get_unused_fd_flags(O_CLOEXEC); if (fd < 0) { dma_fence_put(fence); return fd; } sync_file = sync_file_create(fence); dma_fence_put(fence); if (!sync_file) { put_unused_fd(fd); return -ENOMEM; } fd_install(fd, sync_file->file); info->out.handle = fd; return 0; So I change to stub fence instead. -David I have not applied this patch. The issue was trying to address is when amdgpu_cs_ioctl() failed due to low memory (ENOMEM) but userspace chose to proceed and called amdgpu_cs_fence_to_handle_ioctl(). In amdgpu_cs_fence_to_handle_ioctl(), fence is null and later causing NULL pointer dereference, this patch was to avoid that and system panic But I understand now that its a valid case retuning NULL if fence was already signaled but need to handle case so avoid kernel panic. Seems David patch should fix this, I will test it tomorrow. -Deepak > > Christian. > >> -David >>> If that patch was applied then please revert it immediately. >>> >>> Christian. >>> -David > If you have already pushed the patch then please revert. > > Christian. > >> Signed-off-by: Deepak Sharma >> --- >> drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++ >> 1 file changed, 2 insertions(+) >> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >> index 024dfbd87f11..14166cd8a12f 100644 >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >> @@ -1403,6 +1403,8 @@ static struct dma_fence >> *amdgpu_cs_get_fence(struct amdgpu_device *adev, >> fence = amdgpu_ctx_get_fence(ctx, entity, user->seq_no); >> amdgpu_ctx_put(ctx); >> + if(!fence) >> + return ERR_PTR(-EINVAL); >> return fence; >> } ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer dereference
在 2018/11/24 2:10, Koenig, Christian 写道: > Am 23.11.18 um 15:10 schrieb Zhou, David(ChunMing): >> 在 2018/11/23 21:30, Koenig, Christian 写道: >>> Am 23.11.18 um 14:27 schrieb Zhou, David(ChunMing): 在 2018/11/22 19:25, Christian König 写道: > Am 22.11.18 um 07:56 schrieb Sharma, Deepak: >> when returned fence is not valid mostly due to userspace ignored >> previous error causes NULL pointer dereference. > Again, this is clearly incorrect. The my other mails on the earlier > patch. Sorry for I didn't get your history, but looks from the patch itself, it is still a valid patch, isn't it? >>> No, the semantic of amdgpu_ctx_get_fence() is that we return NULL when >>> the fence is already signaled. >>> >>> So this patch could totally break userspace because it changes the >>> behavior when we try to sync to an already signaled fence. >> Ah, I got your meaning, how about attached patch? > Yeah something like this, but I would just give the > DRM_SYNCOBJ_CREATE_SIGNALED instead. > > I mean that's what this flag is good for isn't it? Yeah, I give a flag initally when creating patch, but as you know, there is a swich case not be able to use that flag: case AMDGPU_FENCE_TO_HANDLE_GET_SYNC_FILE_FD: fd = get_unused_fd_flags(O_CLOEXEC); if (fd < 0) { dma_fence_put(fence); return fd; } sync_file = sync_file_create(fence); dma_fence_put(fence); if (!sync_file) { put_unused_fd(fd); return -ENOMEM; } fd_install(fd, sync_file->file); info->out.handle = fd; return 0; So I change to stub fence instead. -David > > Christian. > >> -David >>> If that patch was applied then please revert it immediately. >>> >>> Christian. >>> -David > If you have already pushed the patch then please revert. > > Christian. > >> Signed-off-by: Deepak Sharma >> --- >> drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++ >> 1 file changed, 2 insertions(+) >> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >> index 024dfbd87f11..14166cd8a12f 100644 >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >> @@ -1403,6 +1403,8 @@ static struct dma_fence >> *amdgpu_cs_get_fence(struct amdgpu_device *adev, >> fence = amdgpu_ctx_get_fence(ctx, entity, user->seq_no); >> amdgpu_ctx_put(ctx); >> + if(!fence) >> + return ERR_PTR(-EINVAL); >> return fence; >> } ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer dereference
Am 23.11.18 um 15:10 schrieb Zhou, David(ChunMing): > > 在 2018/11/23 21:30, Koenig, Christian 写道: >> Am 23.11.18 um 14:27 schrieb Zhou, David(ChunMing): >>> 在 2018/11/22 19:25, Christian König 写道: Am 22.11.18 um 07:56 schrieb Sharma, Deepak: > when returned fence is not valid mostly due to userspace ignored > previous error causes NULL pointer dereference. Again, this is clearly incorrect. The my other mails on the earlier patch. >>> Sorry for I didn't get your history, but looks from the patch itself, it >>> is still a valid patch, isn't it? >> No, the semantic of amdgpu_ctx_get_fence() is that we return NULL when >> the fence is already signaled. >> >> So this patch could totally break userspace because it changes the >> behavior when we try to sync to an already signaled fence. > Ah, I got your meaning, how about attached patch? Yeah something like this, but I would just give the DRM_SYNCOBJ_CREATE_SIGNALED instead. I mean that's what this flag is good for isn't it? Christian. > > -David >> If that patch was applied then please revert it immediately. >> >> Christian. >> >>> -David If you have already pushed the patch then please revert. Christian. > Signed-off-by: Deepak Sharma > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > index 024dfbd87f11..14166cd8a12f 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > @@ -1403,6 +1403,8 @@ static struct dma_fence > *amdgpu_cs_get_fence(struct amdgpu_device *adev, > fence = amdgpu_ctx_get_fence(ctx, entity, user->seq_no); > amdgpu_ctx_put(ctx); > + if(!fence) > + return ERR_PTR(-EINVAL); > return fence; > } ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer dereference
在 2018/11/23 21:30, Koenig, Christian 写道: > Am 23.11.18 um 14:27 schrieb Zhou, David(ChunMing): >> 在 2018/11/22 19:25, Christian König 写道: >>> Am 22.11.18 um 07:56 schrieb Sharma, Deepak: when returned fence is not valid mostly due to userspace ignored previous error causes NULL pointer dereference. >>> Again, this is clearly incorrect. The my other mails on the earlier >>> patch. >> Sorry for I didn't get your history, but looks from the patch itself, it >> is still a valid patch, isn't it? > No, the semantic of amdgpu_ctx_get_fence() is that we return NULL when > the fence is already signaled. > > So this patch could totally break userspace because it changes the > behavior when we try to sync to an already signaled fence. Ah, I got your meaning, how about attached patch? -David > > If that patch was applied then please revert it immediately. > > Christian. > >> -David >>> If you have already pushed the patch then please revert. >>> >>> Christian. >>> Signed-off-by: Deepak Sharma --- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c index 024dfbd87f11..14166cd8a12f 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c @@ -1403,6 +1403,8 @@ static struct dma_fence *amdgpu_cs_get_fence(struct amdgpu_device *adev, fence = amdgpu_ctx_get_fence(ctx, entity, user->seq_no); amdgpu_ctx_put(ctx); + if(!fence) + return ERR_PTR(-EINVAL); return fence; } From 3640a18c31e7b786129286615fcdf397e1142451 Mon Sep 17 00:00:00 2001 From: Chunming Zhou Date: Fri, 23 Nov 2018 22:05:19 +0800 Subject: [PATCH] drm/amdgpu: fix signaled fence isn't handled MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Signed-off-by: Chunming Zhou Cc: Sharma, Deepak CC: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 3 +++ drivers/gpu/drm/drm_syncobj.c | 1 + include/drm/drm_syncobj.h | 1 + 3 files changed, 5 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c index 6a823b58b3b8..e960f9864e9f 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c @@ -1505,6 +1505,9 @@ int amdgpu_cs_fence_to_handle_ioctl(struct drm_device *dev, void *data, if (IS_ERR(fence)) return PTR_ERR(fence); +/* that means fence was signaled */ + if (!fence) + fence = drm_syncobj_get_stub_fence(); switch (info->in.what) { case AMDGPU_FENCE_TO_HANDLE_GET_SYNCOBJ: r = drm_syncobj_create(, 0, fence); diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c index 5f2df10e51c3..e5621f80d501 100644 --- a/drivers/gpu/drm/drm_syncobj.c +++ b/drivers/gpu/drm/drm_syncobj.c @@ -97,6 +97,7 @@ struct dma_fence *drm_syncobj_get_stub_fence(void) return dma_fence_get(_fence); } +EXPORT_SYMBOL(drm_syncobj_get_stub_fence); /** * drm_syncobj_find - lookup and reference a sync object. * @file_private: drm file private pointer diff --git a/include/drm/drm_syncobj.h b/include/drm/drm_syncobj.h index 29244cbcd05e..93e9e9b159ab 100644 --- a/include/drm/drm_syncobj.h +++ b/include/drm/drm_syncobj.h @@ -147,5 +147,6 @@ int drm_syncobj_get_handle(struct drm_file *file_private, int drm_syncobj_get_fd(struct drm_syncobj *syncobj, int *p_fd); int drm_syncobj_search_fence(struct drm_syncobj *syncobj, u64 point, u64 flags, struct dma_fence **fence); +struct dma_fence *drm_syncobj_get_stub_fence(void); #endif -- 2.17.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer dereference
Am 23.11.18 um 14:27 schrieb Zhou, David(ChunMing): > > 在 2018/11/22 19:25, Christian König 写道: >> Am 22.11.18 um 07:56 schrieb Sharma, Deepak: >>> when returned fence is not valid mostly due to userspace ignored >>> previous error causes NULL pointer dereference. >> Again, this is clearly incorrect. The my other mails on the earlier >> patch. > Sorry for I didn't get your history, but looks from the patch itself, it > is still a valid patch, isn't it? No, the semantic of amdgpu_ctx_get_fence() is that we return NULL when the fence is already signaled. So this patch could totally break userspace because it changes the behavior when we try to sync to an already signaled fence. If that patch was applied then please revert it immediately. Christian. > > -David >> If you have already pushed the patch then please revert. >> >> Christian. >> >>> Signed-off-by: Deepak Sharma >>> --- >>> drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++ >>> 1 file changed, 2 insertions(+) >>> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >>> index 024dfbd87f11..14166cd8a12f 100644 >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >>> @@ -1403,6 +1403,8 @@ static struct dma_fence >>> *amdgpu_cs_get_fence(struct amdgpu_device *adev, >>> fence = amdgpu_ctx_get_fence(ctx, entity, user->seq_no); >>> amdgpu_ctx_put(ctx); >>> + if(!fence) >>> + return ERR_PTR(-EINVAL); >>> return fence; >>> } ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer dereference
Am 22.11.18 um 07:56 schrieb Sharma, Deepak: when returned fence is not valid mostly due to userspace ignored previous error causes NULL pointer dereference. Again, this is clearly incorrect. The my other mails on the earlier patch. If you have already pushed the patch then please revert. Christian. Signed-off-by: Deepak Sharma --- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c index 024dfbd87f11..14166cd8a12f 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c @@ -1403,6 +1403,8 @@ static struct dma_fence *amdgpu_cs_get_fence(struct amdgpu_device *adev, fence = amdgpu_ctx_get_fence(ctx, entity, user->seq_no); amdgpu_ctx_put(ctx); + if(!fence) + return ERR_PTR(-EINVAL); return fence; } ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer dereference
On 2018年11月22日 14:56, Sharma, Deepak wrote: when returned fence is not valid mostly due to userspace ignored previous error causes NULL pointer dereference. Signed-off-by: Deepak Sharma Reviewed-by: Chunming Zhou --- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c index 024dfbd87f11..14166cd8a12f 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c @@ -1403,6 +1403,8 @@ static struct dma_fence *amdgpu_cs_get_fence(struct amdgpu_device *adev, fence = amdgpu_ctx_get_fence(ctx, entity, user->seq_no); amdgpu_ctx_put(ctx); + if(!fence) + return ERR_PTR(-EINVAL); return fence; } ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH] drm/amdgpu: add the checking to avoid NULL pointer dereference
when returned fence is not valid mostly due to userspace ignored previous error causes NULL pointer dereference. Signed-off-by: Deepak Sharma --- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c index 024dfbd87f11..14166cd8a12f 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c @@ -1403,6 +1403,8 @@ static struct dma_fence *amdgpu_cs_get_fence(struct amdgpu_device *adev, fence = amdgpu_ctx_get_fence(ctx, entity, user->seq_no); amdgpu_ctx_put(ctx); + if(!fence) + return ERR_PTR(-EINVAL); return fence; } -- 2.15.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx