Re: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr

2019-11-18 Thread Andrey Grodzovsky
age- From: Grodzovsky, Andrey Sent: Friday, November 15, 2019 6:14 AM To: Koenig, Christian ; Deng, Emily ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr Attached. Emily - can you give it a try ? Andrey On 11/14/19 3:12 AM, Christian Kö

Re: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr

2019-11-18 Thread Christian König
gfx@lists.freedesktop.org Subject: Re: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr Attached. Emily - can you give it a try ? Andrey On 11/14/19 3:12 AM, Christian König wrote: What about instead of peeking at the job to actually remove it from ring_mirror_list right there, A

Re: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr

2019-11-18 Thread Andrey Grodzovsky
rom: Grodzovsky, Andrey Sent: Friday, November 15, 2019 6:14 AM To: Koenig, Christian ; Deng, Emily ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr Attached. Emily - can you give it a try ? Andrey On 11/14/19 3:12 AM, Christian König wrote: What ab

Re: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr

2019-11-18 Thread Christian König
with another issue, maybe will try next week. Best wishes Emily Deng -Original Message- From: Grodzovsky, Andrey Sent: Friday, November 15, 2019 6:14 AM To: Koenig, Christian ; Deng, Emily ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr

Re: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr

2019-11-18 Thread Andrey Grodzovsky
wishes Emily Deng -Original Message- From: Grodzovsky, Andrey Sent: Friday, November 15, 2019 6:14 AM To: Koenig, Christian ; Deng, Emily ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr Attached. Emily - can you give it a try ? Andrey

Re: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr

2019-11-18 Thread Christian König
, Emily ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr Attached. Emily - can you give it a try ? Andrey On 11/14/19 3:12 AM, Christian König wrote: What about instead of peeking at the job to actually remove it from ring_mirror_list right

Re: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr

2019-11-18 Thread Andrey Grodzovsky
:14 AM To: Koenig, Christian ; Deng, Emily ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr Attached. Emily - can you give it a try ? Andrey On 11/14/19 3:12 AM, Christian König wrote: What about instead of peeking at the job to actually remove

RE: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr

2019-11-14 Thread Deng, Emily
jobs:begin,tid:2262, >>>>>>>>> pid:2262 >>>>>>>>> Nov 12 12:58:20 ubuntu-drop-August-2018-rc2-gpu0-vf02 kernel: >>>>>>>>> [11380.695107] Emily:drm_sched_cleanup_jobs,tid:2262, pid:2262 >>>>

Re: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr

2019-11-14 Thread Andrey Grodzovsky
zovsky, Andrey >Sent: Tuesday, November 12, 2019 11:28 AM >To: Koenig, Christian ; Deng, Emily >; amd-gfx@lists.freedesktop.org >Subject: Re: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr > >Thinking more about this claim - we assume here that if cancel_delayed_work &

Re: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr

2019-11-14 Thread Andrey Grodzovsky
12, 2019 11:28 AM >To: Koenig, Christian ; Deng, Emily >; amd-gfx@lists.freedesktop.org >Subject: Re: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr > >Thinking more about this claim - we assume here that if cancel_delayed_work >returned true it guarantees that timeo

Re: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr

2019-11-14 Thread Christian König
elayed_work_sync to flush the timeout work as timeout work itself >waits for schedule thread  to be parked again when calling park_thread. > >Andrey > > >From: amd-gfx on behalf of >Koenig, Christian >Sent: 08 November 2019 0

Re: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr

2019-11-13 Thread Andrey Grodzovsky
_cb,Process information: process  pid 0 thread  pid 0, s_job:f086ec84, tid:2262, pid:2262 >-Original Message- >From: Grodzovsky, Andrey >Sent: Tuesday, November 12, 2019 11:28 AM >To: Koenig, Christian ; Deng, Emily >; amd-gfx@lists.freedesktop.org >Subject:

Re: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr

2019-11-13 Thread Christian König
_free_cb,Process information: process  pid 0 thread  pid 0, s_job:f086ec84, tid:2262, pid:2262 >-Original Message- >From: Grodzovsky, Andrey >Sent: Tuesday, November 12, 2019 11:28 AM >To: Koenig, Christian ; Deng, Emily >; amd-gfx@lists.freedesktop.org >Subject:

Re: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr

2019-11-13 Thread Andrey Grodzovsky
pu_job_free_cb,Process information: process  pid 0 thread  pid 0, s_job:f086ec84, tid:2262, pid:2262 >-Original Message- >From: Grodzovsky, Andrey >Sent: Tuesday, November 12, 2019 11:28 AM >To: Koenig, Christian ; Deng, Emily >; amd-gfx@lists.freedesktop.org >

Re: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr

2019-11-12 Thread Christian König
k to >cancel_delayed_work_sync to flush the timeout work as timeout work itself >waits for schedule thread  to be parked again when calling park_thread. > >Andrey > > >From: amd-gfx on behalf of >Koenig, Christian >Sent: 08 November 2019 05:3

Re: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr

2019-11-12 Thread Andrey Grodzovsky
62 >-Original Message- >From: Grodzovsky, Andrey >Sent: Tuesday, November 12, 2019 11:28 AM >To: Koenig, Christian ; Deng, Emily >; amd-gfx@lists.freedesktop.org >Subject: Re: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr > >Thinking more about this claim

Re: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr

2019-11-12 Thread Christian König
>Sent: Tuesday, November 12, 2019 11:28 AM >To: Koenig, Christian ; Deng, Emily >; amd-gfx@lists.freedesktop.org >Subject: Re: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr > >Thinking more about this claim - we assume here that if cancel_delayed_work >returned

RE: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr

2019-11-11 Thread Deng, Emily
egards, >>> Christian. >>> >>> On 08.11.19 at 11:22, Deng, Emily wrote: >>>> Hi Christian, >>>> No, I am on the new branch and also have the patch. Even if they >>>> are freed by >>> the main scheduler, how could we avo

RE: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr

2019-11-11 Thread Deng, Emily
age- >From: Grodzovsky, Andrey >Sent: Tuesday, November 12, 2019 5:35 AM >To: Deng, Emily ; Koenig, Christian >; amd-gfx@lists.freedesktop.org >Subject: Re: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr > >Emily - is there a particular scenario to reproduce this ? I

Re: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr

2019-11-11 Thread Grodzovsky, Andrey
ead. Andrey From: amd-gfx on behalf of Koenig, Christian Sent: 08 November 2019 05:35:18 To: Deng, Emily; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr Hi Emily, exactly that can't happen. See h

Re: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr

2019-11-11 Thread Andrey Grodzovsky
To: Grodzovsky, Andrey ; Koenig, Christian ; amd-gfx@lists.freedesktop.org Subject: RE: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr Hi Andrey, I don’t think your patch will help with this, as it may call kthread_should_park in drm_sched_cleanup_jobs first, and then call kcl_kthread_park

Re: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr

2019-11-11 Thread Andrey Grodzovsky
3:01 AM To: Koenig, Christian ; Deng, Emily ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr On 11/8/19 5:35 AM, Koenig, Christian wrote: Hi Emily, exactly that can't happen. See here:     /* Don't destroy jobs while the timeout

RE: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr

2019-11-11 Thread Deng, Emily
m: amd-gfx On Behalf Of Deng, >Emily >Sent: Monday, November 11, 2019 3:19 PM >To: Grodzovsky, Andrey ; Koenig, Christian >; amd-gfx@lists.freedesktop.org >Subject: RE: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr > >Hi Andrey, >I don’t think your patch will help

RE: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr

2019-11-10 Thread Deng, Emily
zovsky, Andrey >Sent: Saturday, November 9, 2019 3:01 AM >To: Koenig, Christian ; Deng, Emily >; amd-gfx@lists.freedesktop.org >Subject: Re: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr > > >On 11/8/19 5:35 AM, Koenig, Christian wrote: >> Hi Emily, >>

Re: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr

2019-11-08 Thread Grodzovsky, Andrey
> -Original Message- >> From: Koenig, Christian >> Sent: Friday, November 8, 2019 6:35 PM >> To: Deng, Emily ; amd-gfx@lists.freedesktop.org >> Subject: Re: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr >> >> Hi Emily, >> >> exact

Re: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr

2019-11-08 Thread Grodzovsky, Andrey
Original Message- >>> From: Koenig, Christian >>> Sent: Friday, November 8, 2019 6:26 PM >>> To: Deng, Emily ; amd-gfx@lists.freedesktop.org >>> Subject: Re: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr >>> >>> Hi Emily, >>>

RE: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr

2019-11-08 Thread Deng, Emily
t: Friday, November 8, 2019 6:35 PM >To: Deng, Emily ; amd-gfx@lists.freedesktop.org >Subject: Re: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr > >Hi Emily, > >exactly that can't happen. See here: > >>     /* Don't destroy jobs while the timeout worker is runn

Re: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr

2019-11-08 Thread Koenig, Christian
while in > amdgpu_device_gpu_recover, and before calling drm_sched_stop. > > Best wishes > Emily Deng > > > >> -Original Message- >> From: Koenig, Christian >> Sent: Friday, November 8, 2019 6:26 PM >> To: Deng, Emily ; amd-gfx@lists.freedesktop.

RE: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr

2019-11-08 Thread Deng, Emily
Christian >Sent: Friday, November 8, 2019 6:26 PM >To: Deng, Emily ; amd-gfx@lists.freedesktop.org >Subject: Re: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr > >Hi Emily, > >well who is calling amdgpu_device_gpu_recover() in this case? > >When it's not the sche

Re: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr

2019-11-08 Thread Koenig, Christian
day, November 8, 2019 6:15 PM >> To: Deng, Emily ; amd-gfx@lists.freedesktop.org >> Subject: Re: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr >> >> Hi Emily, >> >> in this case you are on an old code branch. >> >> Jobs are freed now by t

RE: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr

2019-11-08 Thread Deng, Emily
>Sent: Friday, November 8, 2019 6:15 PM >To: Deng, Emily ; amd-gfx@lists.freedesktop.org >Subject: Re: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr > >Hi Emily, > >in this case you are on an old code branch. > >Jobs are freed now by the main scheduler threa

Re: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr

2019-11-08 Thread Koenig, Christian
815374] ? kthread_create_worker_on_cpu+0x70/0x70 > [ 449.815799] ret_from_fork+0x35/0x40 > >> -Original Message- >> From: Koenig, Christian >> Sent: Friday, November 8, 2019 5:43 PM >> To: Deng, Emily ; amd-gfx@lists.freedesktop.org >> Subject: Re: [PATCH]

RE: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr

2019-11-08 Thread Deng, Emily
ovember 8, 2019 5:43 PM >To: Deng, Emily ; amd-gfx@lists.freedesktop.org >Subject: Re: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr > >On 08.11.19 at 10:39, Deng, Emily wrote: >> Sorry, please take your time. > >Have you seen my other response a bit below? >

Re: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr

2019-11-08 Thread Koenig, Christian
issues. Regards, Christian. > > Best wishes > Emily Deng > > > >> -Original Message- >> From: Koenig, Christian >> Sent: Friday, November 8, 2019 5:08 PM >> To: Deng, Emily ; amd-gfx@lists.freedesktop.org >> Subject: Re: [PATCH] drm/amdgpu: Fix

RE: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr

2019-11-08 Thread Deng, Emily
Sorry, please take your time. Best wishes Emily Deng >-Original Message- >From: Koenig, Christian >Sent: Friday, November 8, 2019 5:08 PM >To: Deng, Emily ; amd-gfx@lists.freedesktop.org >Subject: Re: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr > >

Re: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr

2019-11-08 Thread Koenig, Christian
8, 2019 10:56 AM >> To: Koenig, Christian ; amd- >> g...@lists.freedesktop.org >> Subject: RE: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr >> >>> -Original Message- >>> From: Christian König >>> Sent: Thursday, November 7, 2019 7

RE: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr

2019-11-08 Thread Deng, Emily
Ping. Best wishes Emily Deng >-Original Message- >From: amd-gfx On Behalf Of Deng, >Emily >Sent: Friday, November 8, 2019 10:56 AM >To: Koenig, Christian ; amd- >g...@lists.freedesktop.org >Subject: RE: [PATCH] drm/amdgpu: Fix the null pointer issue for

RE: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr

2019-11-07 Thread Deng, Emily
>-Original Message- >From: Christian König >Sent: Thursday, November 7, 2019 7:28 PM >To: Deng, Emily ; amd-gfx@lists.freedesktop.org >Subject: Re: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr > >On 07.11.19 at 11:25, Emily Deng wrote: >> When

Re: [PATCH] drm/amdgpu: Fix the null pointer issue for tdr

2019-11-07 Thread Christian König
On 07.11.19 at 11:25, Emily Deng wrote: When the job is already signaled, the s_fence is freed. Then it will have a null pointer in amdgpu_device_gpu_recover. NAK, the s_fence is only set to NULL when the job is destroyed. See drm_sched_job_cleanup(). When you see a job without an s_fence